Hey Finxters! I have another set of cheat sheets for you! This time, I am going to focus on the more advanced aspects of Python and what you can do with it. As you know, Python is a flexible language used in web development, games, and desktop applications. I won't take up too much of your time, so let's dive right into these more advanced Python cheat sheets!
Cheat Sheet 0: Finxter Full Cheat Sheet Course
Cheat Sheet 1: DataQuest
This cheat sheet is from DataQuest and covers intermediate Python topics: regular expressions, the date/time module, and Counter. This is one you will want to have pinned to the wall or in your developer's binder to keep handy as you work.
Pros: Great for budding Python developers; keep it handy as you work.
Cons: None that I can see.
Cheat Sheet 2: DataCamp
It is important to know how to import data, no matter what stage of your career you are at. As an intermediate Pythoner, you should keep this cheat sheet handy when working an entry-level data entry job and when developing your own projects.
Pros: Great for learning importing data sets in Python.
Cons: None that I can see.
Cheat Sheet 3: DataCamp
Importing data is only half the job: you also have to plot it so that businesses can understand it and use it to their benefit. This cheat sheet will help you learn matplotlib and write some amazing graphical visualizations with Python.
Pros: Great to have for matplotlib development.
Cons: None that I can see.
Cheat Sheet 4: GitHub
This cheat sheet is for machine learning and is one you will want to keep in your developer's binder as you work. Machine learning and Python go together like peanut butter and jelly, and scikit-learn is going to be your best friend. If your developer's journey takes you to machine learning, make sure to keep this cheat sheet handy.
Pros: scikit-learn is easy to learn with this cheat sheet.
Cons: None that I can see.
Cheat Sheet 5: DataCamp
SQL is the standard query language for relational databases, and it works with data sets of all kinds and sizes. Keep this cheat sheet handy! BI and other business applications rely on you being able to use SQL.
Pros: Rated ‘E’ for everyone. Easy to read and implement.
Cons: None that I can see.
Cheat Sheet 6: PyTorch
This cheat sheet is more of a tutorial that will teach you PyTorch for deep learning projects. Here you will get hands-on practice with PyTorch.
Pros: You will get a deep understanding of PyTorch and how it is used.
Cons: It is an online tutorial.
Cheat Sheet 7: DataCamp
Yet another from DataCamp! This one covers spaCy, a library that helps you process and understand natural-language text from documents. It is one I keep in my development folder and use for natural language processing.
Pros: Rated ‘E’ for everyone.
Cons: None that I can see.
Cheat Sheet 8: Ask Python
This cheat sheet is also more of a tutorial for learning image processing in Python. The best way to learn is to get your hands dirty! Ask Python is good for that, so you can learn what you need and boost your skills.
Pros: Rated ‘E’ for everyone.
Cons: None that I can see.
Cheat Sheet 9: TutorialsPoint
This cheat sheet is also a tutorial, this time on database access with Python. This is an incredibly important skill when you freelance or end up working for a company in a data entry position.
Pros: Rated ‘E’ for everyone. This tutorial is one I have used myself! It includes code snippets to learn from.
Cons: It is a tutorial, not a cheat sheet to print.
Cheat Sheet 10: FullStack Python
This is also a tutorial for you to learn from. It covers the deployment of web applications in Python! It goes into depth on tools and resources and includes a learning checklist, starting with an introduction to what deployment is and why it is necessary.
Pros: Rated ‘E’ for everyone. This is important to know if you are a Pythoner in Web development.
Cons: Needs to be bookmarked on your browser.
These are the cheat sheets and tutorials I think you will find helpful as a Pythonista developing in your particular field. As you can see, this time I wanted to give you a wide range of cheat sheets that intermediate Pythonistas use in their chosen careers. I hope at least one of these cheat sheets or tutorials is useful to you on your journey! Thank you once again for joining me, and I can’t wait to see you again! 😉😉
Download this 2-page SQL Window Functions Cheat Sheet in PDF or PNG format, print it out, and stick it to your desk.
The SQL Window Functions Cheat Sheet provides you with the syntax of window functions, a list of window functions, and examples.
Window Functions
Window functions compute their result based on a sliding window frame, a set of rows that are somehow related to the current row.
Aggregate Functions vs. Window Functions
Unlike aggregate functions, window functions do not collapse rows.
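To make the difference concrete, here is a minimal sketch that runs both kinds of query through Python's built-in sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions); the `sales` table and its columns are hypothetical sample data, not part of the cheat sheet.

```python
import sqlite3

# Hypothetical sample data: (city, amount) sales rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Rome", 10), ("Rome", 20), ("Paris", 5)],
)

# Aggregate function with GROUP BY: rows collapse to one row per city.
grouped = conn.execute(
    """
    SELECT city, SUM(amount)
    FROM sales
    GROUP BY city
    ORDER BY city
    """
).fetchall()

# Window function: every input row is kept and carries its city total.
windowed = conn.execute(
    """
    SELECT city, amount, SUM(amount) OVER (PARTITION BY city) AS city_total
    FROM sales
    ORDER BY city, amount
    """
).fetchall()

print(grouped)   # two rows, one per city
print(windowed)  # three rows, one per original sale
```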
Syntax
Named Window Definition
PARTITION BY, ORDER BY, and window frame definition are all optional.
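As a rough illustration of the two syntax forms, the sketch below writes the same window once inline in OVER (...) and once as a named window in a WINDOW clause. It assumes SQLite 3.25 or newer via Python's sqlite3 module; the `emp` table, its columns, and the window name `w` are made up for the example.

```python
import sqlite3

# Hypothetical employee table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("Ann", "IT", 100), ("Bob", "IT", 200), ("Cid", "HR", 50)],
)

# Inline form: the window is defined directly inside OVER (...).
rows_inline = conn.execute(
    """
    SELECT name, dept, salary,
           AVG(salary) OVER (PARTITION BY dept) AS dept_avg
    FROM emp
    ORDER BY name
    """
).fetchall()

# Named form: the window is defined once in a WINDOW clause and
# referenced by name, which helps when several functions share it.
rows_named = conn.execute(
    """
    SELECT name, dept, salary,
           AVG(salary) OVER w AS dept_avg
    FROM emp
    WINDOW w AS (PARTITION BY dept)
    ORDER BY name
    """
).fetchall()

print(rows_inline)
print(rows_inline == rows_named)
```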
PARTITION BY
PARTITION BY divides rows into multiple groups, called partitions, to which the window function is applied.
Default Partition: With no PARTITION BY clause, the entire result set is the partition.
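A small sketch of the default partition, assuming SQLite 3.25 or newer through Python's sqlite3 module and a hypothetical `sales` table: an empty OVER () treats the whole result set as one partition, while PARTITION BY city restarts the sum for each city.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Rome", 10), ("Rome", 20), ("Paris", 5)],
)

partition_rows = conn.execute(
    """
    SELECT city, amount,
           SUM(amount) OVER ()                  AS grand_total, -- one partition: all rows
           SUM(amount) OVER (PARTITION BY city) AS city_total   -- one partition per city
    FROM sales
    ORDER BY city, amount
    """
).fetchall()

for row in partition_rows:
    print(row)
```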
ORDER BY
ORDER BY specifies the order of rows in each partition to which the window function is applied.
Default ORDER BY: With no ORDER BY clause, the order of rows within each partition is arbitrary.
Window Frame
A window frame is a set of rows that are somehow related to the current row. The window frame is evaluated separately within each partition.
The bounds can be any of the five options:
- UNBOUNDED PRECEDING
- n PRECEDING
- CURRENT ROW
- n FOLLOWING
- UNBOUNDED FOLLOWING
The lower_bound must be BEFORE the upper_bound.
As of 2020, GROUPS is only supported in PostgreSQL 11 and up.
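The sketch below shows one explicit frame, ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING, which sums each row with its immediate neighbours. It assumes SQLite 3.25 or newer through Python's sqlite3 module; the `daily` table is hypothetical.

```python
import sqlite3

# Hypothetical daily amounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40)],
)

# The frame is the previous row, the current row, and the next row.
# At the edges of the partition the frame simply has fewer rows.
frame_rows = conn.execute(
    """
    SELECT day, amount,
           SUM(amount) OVER (
               ORDER BY day
               ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
           ) AS neighbour_sum
    FROM daily
    ORDER BY day
    """
).fetchall()

for row in frame_rows:
    print(row)
```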
Abbreviations
Abbreviation | Meaning |
---|---|
UNBOUNDED PRECEDING | BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW |
n PRECEDING | BETWEEN n PRECEDING AND CURRENT ROW |
CURRENT ROW | BETWEEN CURRENT ROW AND CURRENT ROW |
n FOLLOWING | BETWEEN CURRENT ROW AND n FOLLOWING |
UNBOUNDED FOLLOWING | BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING |
Default Window Frame
- If ORDER BY is specified, then the frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
- Without ORDER BY, the frame specification is ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
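The two defaults can be checked with a short sketch, assuming SQLite 3.25 or newer via Python's sqlite3 module and a hypothetical `daily` table. Two rows deliberately share day 2: because the default frame with ORDER BY is a RANGE frame, rows tied on the ORDER BY value are peers and both see the same running total.

```python
import sqlite3

# Hypothetical data with a tie on day 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily VALUES (?, ?)",
    [(1, 10), (2, 20), (2, 5), (3, 1)],
)

default_rows = conn.execute(
    """
    SELECT day, amount,
           -- ORDER BY present, no frame given: default RANGE frame
           SUM(amount) OVER (ORDER BY day) AS default_frame,
           -- the same frame written out explicitly
           SUM(amount) OVER (
               ORDER BY day
               RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS explicit_range,
           -- no ORDER BY: the frame is the whole partition
           SUM(amount) OVER () AS no_order_by
    FROM daily
    ORDER BY day, amount
    """
).fetchall()

for row in default_rows:
    print(row)
```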
Logical Order of Operations in SQL
1. FROM, JOIN
2. WHERE
3. GROUP BY
4. aggregate functions
5. HAVING
6. window functions
7. SELECT
8. DISTINCT
9. UNION/INTERSECT/EXCEPT
10. ORDER BY
11. OFFSET
12. LIMIT/FETCH/TOP
You can use window functions in SELECT and ORDER BY. However, you can't put window functions anywhere in the FROM, WHERE, GROUP BY, or HAVING clauses.
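A common workaround, sketched here under the assumption of SQLite 3.25 or newer via Python's sqlite3 module, is to compute the window function in a subquery and filter on its alias in the outer query. The `emp` table and the alias `rn` are hypothetical.

```python
import sqlite3

# Hypothetical employee table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("Ann", 100), ("Bob", 200), ("Cid", 50)],
)

# A window function directly in WHERE is rejected by the database.
rejected = False
try:
    conn.execute(
        "SELECT name FROM emp "
        "WHERE ROW_NUMBER() OVER (ORDER BY salary DESC) <= 2"
    )
except sqlite3.Error:
    rejected = True

# Workaround: compute the window function in a subquery, where it is
# allowed in SELECT, then filter on the resulting column outside.
top_two = conn.execute(
    """
    SELECT name, salary
    FROM (
        SELECT name, salary,
               ROW_NUMBER() OVER (ORDER BY salary DESC) AS rn
        FROM emp
    )
    WHERE rn <= 2
    ORDER BY rn
    """
).fetchall()

print(rejected)
print(top_two)
```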
- Ranking Functions
  - row_number()
  - rank()
  - dense_rank()
- Distribution Functions
  - percent_rank()
  - cume_dist()
- Analytic Functions
  - lead()
  - lag()
  - ntile()
  - first_value()
  - last_value()
  - nth_value()
- Aggregate Functions
  - avg()
  - count()
  - max()
  - min()
  - sum()
Ranking Functions
- row_number() - unique number for each row within partition, with different numbers for tied values
- rank() - ranking within partition, with gaps and same ranking for tied values
- dense_rank() - ranking within partition, with no gaps and same ranking for tied values
ORDER BY and Window Frame: rank() and dense_rank() require ORDER BY, but row_number() does not require ORDER BY. Ranking functions do not accept window frame definition (ROWS, RANGE, GROUPS).
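The three ranking functions differ only in how they treat ties, which the sketch below tries to show. It assumes SQLite 3.25 or newer via Python's sqlite3 module and a hypothetical `scores` table; `name` is added to the row_number() ordering purely as a tie-breaker so the output is deterministic.

```python
import sqlite3

# Hypothetical scores; 'a' and 'b' are tied at 90.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 90), ("b", 90), ("c", 80), ("d", 70)],
)

ranking_rows = conn.execute(
    """
    SELECT name, score,
           ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,
           RANK()       OVER (ORDER BY score DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk
    FROM scores
    ORDER BY score DESC, name
    """
).fetchall()

# Columns: name, score, row_number, rank, dense_rank.
# rank() skips 2 after the tie; dense_rank() does not.
for row in ranking_rows:
    print(row)
```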
Distribution Functions
- percent_rank() - the percentile ranking number of a row, a value in the [0, 1] interval: (rank - 1) / (total number of rows - 1)
- cume_dist() - the cumulative distribution of a value within a group of values, i.e., the number of rows with values less than or equal to the current row’s value divided by the total number of rows; a value in the (0, 1] interval
ORDER BY and Window Frame: Distribution functions require ORDER BY. They do not accept window frame definition (ROWS, RANGE, GROUPS).
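Both formulas can be checked on four rows with one tie. This is a sketch assuming SQLite 3.25 or newer via Python's sqlite3 module; the `vals` table is hypothetical, and results are rounded to three decimals in Python only to keep the printout short.

```python
import sqlite3

# Hypothetical values with a tie at 20.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vals (v INTEGER)")
conn.executemany("INSERT INTO vals VALUES (?)", [(10,), (20,), (20,), (30,)])

raw = conn.execute(
    """
    SELECT v,
           PERCENT_RANK() OVER (ORDER BY v) AS pct_rank,  -- (rank - 1) / (rows - 1)
           CUME_DIST()    OVER (ORDER BY v) AS cume       -- rows <= v / rows
    FROM vals
    ORDER BY v
    """
).fetchall()

# Round for display only.
dist_rows = [(v, round(p, 3), round(c, 3)) for v, p, c in raw]

for row in dist_rows:
    print(row)
```

For the tied value 20, rank is 2, so percent_rank is (2 - 1) / (4 - 1), and three of the four rows are less than or equal to 20, so cume_dist is 3 / 4.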
Analytic Functions
- lead(expr, offset, default) - the value for the row offset rows after the current; offset and default are optional; default values: offset = 1, default = NULL
- lag(expr, offset, default) - the value for the row offset rows before the current; offset and default are optional; default values: offset = 1, default = NULL
- ntile(n) - divide rows within a partition as equally as possible into n groups, and assign each row its group number.
ORDER BY and Window Frame: ntile(), lead(), and lag() require an ORDER BY. They do not accept window frame definition (ROWS, RANGE, GROUPS).
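Here is a minimal sketch of lag(), lead(), and ntile() side by side, assuming SQLite 3.25 or newer via Python's sqlite3 module and a hypothetical `daily` table. lag() uses the defaults (offset 1, default NULL), while lead() passes an explicit default of 0.

```python
import sqlite3

# Hypothetical daily amounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)],
)

analytic_rows = conn.execute(
    """
    SELECT day, amount,
           LAG(amount)        OVER (ORDER BY day) AS prev_amount, -- NULL on the first row
           LEAD(amount, 1, 0) OVER (ORDER BY day) AS next_amount, -- 0 on the last row
           NTILE(2)           OVER (ORDER BY day) AS half         -- group number 1 or 2
    FROM daily
    ORDER BY day
    """
).fetchall()

# SQL NULL comes back as Python None.
for row in analytic_rows:
    print(row)
```

With five rows and two groups, the split cannot be even; in SQLite the larger group comes first, so days 1 to 3 land in group 1.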
- first_value(expr) - the value for the first row within the window frame
- last_value(expr) - the value for the last row within the window frame
Note: You usually want to use RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING with last_value(). With the default window frame for ORDER BY, RANGE UNBOUNDED PRECEDING, last_value() returns the value for the current row.
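The note is easy to verify with a short sketch, assuming SQLite 3.25 or newer via Python's sqlite3 module and a hypothetical `daily` table. With the default frame, last_value() just echoes the current row; with the full frame it returns the last row of the partition.

```python
import sqlite3

# Hypothetical daily amounts with no ties on day.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (day INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO daily VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

value_rows = conn.execute(
    """
    SELECT day, amount,
           FIRST_VALUE(amount) OVER (ORDER BY day) AS first_amt,
           -- default frame ends at the current row
           LAST_VALUE(amount)  OVER (ORDER BY day) AS last_default,
           -- full frame reaches the end of the partition
           LAST_VALUE(amount)  OVER (
               ORDER BY day
               RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS last_full
    FROM daily
    ORDER BY day
    """
).fetchall()

for row in value_rows:
    print(row)
```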
- nth_value(expr, n) - the value for the n-th row within the window frame; n must be an integer
ORDER BY and Window Frame: first_value(), last_value(), and nth_value() do not require an ORDER BY. They accept window frame definition (ROWS, RANGE, GROUPS).
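Because nth_value() looks only inside the frame, the frame choice changes its result. The sketch below, which assumes SQLite 3.25 or newer via Python's sqlite3 module and a hypothetical `daily` table, asks for the second row under the default frame and under a full-partition frame.

```python
import sqlite3

# Hypothetical daily amounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (day INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO daily VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

nth_rows = conn.execute(
    """
    SELECT day,
           -- default frame: on day 1 the frame holds one row, so there
           -- is no second row and the result is NULL
           NTH_VALUE(amount, 2) OVER (ORDER BY day) AS second_default,
           -- full frame: the second row is always visible
           NTH_VALUE(amount, 2) OVER (
               ORDER BY day
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS second_full
    FROM daily
    ORDER BY day
    """
).fetchall()

# SQL NULL comes back as Python None.
for row in nth_rows:
    print(row)
```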
Aggregate Functions
- avg(expr) - average value for rows within the window frame
- count(expr) - count of values for rows within the window frame
- max(expr) - maximum value within the window frame
- min(expr) - minimum value within the window frame
- sum(expr) - sum of values within the window frame
ORDER BY and Window Frame: Aggregate functions do not require an ORDER BY. They accept window frame definition (ROWS, RANGE, GROUPS).
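As a closing sketch, the query below combines several aggregate window functions with different frames: a running maximum and running count (ORDER BY with the default frame), a minimum over the current and previous row (explicit ROWS frame), and an overall average (no ORDER BY). It assumes SQLite 3.25 or newer via Python's sqlite3 module; the `daily` table is hypothetical.

```python
import sqlite3

# Hypothetical daily amounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily VALUES (?, ?)",
    [(1, 10), (2, 40), (3, 20), (4, 30)],
)

agg_rows = conn.execute(
    """
    SELECT day, amount,
           MAX(amount) OVER (ORDER BY day) AS running_max,
           COUNT(*)    OVER (ORDER BY day) AS running_count,
           MIN(amount) OVER (
               ORDER BY day
               ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
           ) AS min_last_two,
           AVG(amount) OVER () AS overall_avg
    FROM daily
    ORDER BY day
    """
).fetchall()

for row in agg_rows:
    print(row)
```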