Python Data Science Select from Pandas DataFrame

Practical Data Science with Python – Selecting Data from a DataFrame

The task:

Emulate SQL SELECT with a pandas dataframe

E.g.

SELECT *
FROM table
WHERE column_name = some_value;

The code in this lesson was written in a Juypter Notebook. This means it can be run sequentially using IPython. If you wish to use non-interactive Python you can create a .py file and run it as you normally would, omitting any special directives such as %load_ext nb_black.

# Optional auto-formatting. Installation required (`pip install nb_black`)
%load_ext nb_black

# Import Pandas
import pandas as pd

UK Best Selling Cars of 2021 up to February 5th

Source: https://www.autoexpress.co.uk/news/94280/best-selling-cars-2021

# Create a dictionary containing our data
cars = {
    "Model": [
        "Vauxhall Corsa",
        "Kia Sportage",
        "Nissan Qashqai",
        "Ford Fiesta",
        "Volvo XC40",
    ],
    "Sales": [3078, 2986, 2835, 2594, 2201],
}

# Create a pandas dataframe from the data in `cars`
df = pd.DataFrame(cars, columns=["Model", "Sales"])

# Display the dataframe
print(df)

            Model  Sales
0  Vauxhall Corsa   3078
1    Kia Sportage   2986
2  Nissan Qashqai   2835
3     Ford Fiesta   2594
4      Volvo XC40   2201

Practice some `SELECT` statement equivalents

# Use df.loc for label-based indexing
df.loc[df["Model"] == "Kia Sportage"]

	Model	Sales
1	Kia Sportage	2986

df.loc[df["Sales"] > 2500]

	Model	Sales
0	Vauxhall Corsa	3078
1	Kia Sportage	2986
2	Nissan Qashqai	2835
3	Ford Fiesta	2594

# Search for a string
df.loc[df["Model"].str.contains("V")]

	Model	Sales
0	Vauxhall Corsa	3078
4	Volvo XC40	2201

# Multiple conditions
# Use `&`, `|`, `~` for and , or, not
# Parentheses required due to precedence of logic and comparision operators
df.loc[(df["Model"].str.contains("V")) | (df["Sales"] == 2986)]

	Model	Sales
0	Vauxhall Corsa	3078
1	Kia Sportage	2986
4	Volvo XC40	2201

That is how you can query pandas dataframes in a similar way to using SQL SELECT statements.

Happy Computing!

Tags: data science pandas python

Compucademy

Practical Data Science with Python – Selecting Data from a DataFrame

The task:

UK Best Selling Cars of 2021 up to February 5th

Practice some `SELECT` statement equivalents

Leave a Reply Cancel reply

Compucademy

Practical Data Science with Python – Selecting Data from a DataFrame

The task:

UK Best Selling Cars of 2021 up to February 5th

Practice some SELECT statement equivalents

Leave a Reply Cancel reply

Practice some `SELECT` statement equivalents