Mapping

Allen Li
1 min read · May 16, 2021


Create variable centered_price containing a version of the price column with the mean price subtracted.

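import pandas as pd
import numpy as np
# df is assumed to be the wine reviews dataset, already loaded as a DataFrame
mean_price = df.price.mean()
centered_price = df.price.apply(lambda x: x - mean_price)
# Equivalent vectorized form:
centered_price = df.price - df.price.mean()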
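
A minimal sketch to check that the two forms agree. The three prices below are made up for illustration and stand in for the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the wine reviews DataFrame
df = pd.DataFrame({"price": [10.0, 20.0, 30.0]})

mean_price = df.price.mean()  # 20.0

# Element-by-element version using apply
centered_apply = df.price.apply(lambda x: x - mean_price)

# Vectorized version: subtracts the scalar from the whole column at once
centered_vectorized = df.price - df.price.mean()

print(centered_vectorized.tolist())  # [-10.0, 0.0, 10.0]
print(centered_apply.equals(centered_vectorized))  # True
```

The vectorized form is usually preferable on large columns because it avoids calling a Python function once per row.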

Create a variable bargain_wine with the title of the wine with the highest points-to-price ratio in the dataset.

bargain_idx = (df.points / df.price).idxmax()
bargain_wine = df.loc[bargain_idx, 'title']
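
To see how idxmax and loc work together, here is a small sketch with three invented wines (the titles and numbers are assumptions, not rows from the dataset).

```python
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "title": ["Wine A", "Wine B", "Wine C"],
    "points": [90, 88, 95],
    "price": [30.0, 11.0, 95.0],
})

# Points-to-price ratio for each row: 3.0, 8.0, 1.0
ratio = df.points / df.price

# idxmax returns the index label of the largest ratio
bargain_idx = ratio.idxmax()

# loc looks up the title at that label
bargain_wine = df.loc[bargain_idx, "title"]
print(bargain_wine)  # Wine B
```

If several wines tie for the best ratio, idxmax returns only the first one.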

Count how many reviews mention the words "tropical" and "fruity", stored in a series descriptor_counts.

n_trop = df.description.map(lambda desc: "tropical" in desc).sum()
n_fruity = df.description.map(lambda desc: "fruity" in desc).sum()
descriptor_counts = pd.Series([n_trop, n_fruity], index=['tropical', 'fruity'])
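
A short sketch on three invented descriptions shows what this counts. The `in` test is case-sensitive and counts reviews, not total occurrences of the word. The case-insensitive variant with str.contains is one possible alternative, not part of the original exercise.

```python
import pandas as pd

# Hypothetical descriptions
df = pd.DataFrame({"description": [
    "Tropical fruit aromas, fruity and bright",
    "A tropical, fruity finish",
    "Dry and earthy",
]})

# Case-sensitive membership test, one boolean per review
n_trop = df.description.map(lambda desc: "tropical" in desc).sum()
n_fruity = df.description.map(lambda desc: "fruity" in desc).sum()
descriptor_counts = pd.Series([n_trop, n_fruity], index=["tropical", "fruity"])
print(descriptor_counts.tolist())  # [1, 2]

# Case-insensitive alternative: also matches the capitalized "Tropical"
n_trop_any_case = df.description.str.contains("tropical", case=False).sum()
print(n_trop_any_case)  # 2
```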

Create a series star_ratings with the number of stars corresponding to each review in the dataset.

def stars(row):
    if row.country == 'Canada':
        return 3
    elif row.points >= 95:
        return 3
    elif row.points >= 85:
        return 2
    else:
        return 1

star_ratings = df.apply(stars, axis='columns')
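
The row-wise apply can be tried on a few invented rows, one per branch of the function. The countries and scores below are assumptions chosen only to exercise each case.

```python
import pandas as pd

# Hypothetical rows: one for each branch of stars()
df = pd.DataFrame({
    "country": ["Canada", "Italy", "France", "US"],
    "points": [80, 96, 88, 82],
})

def stars(row):
    # Canadian wines always get 3 stars, regardless of points
    if row.country == "Canada":
        return 3
    elif row.points >= 95:
        return 3
    elif row.points >= 85:
        return 2
    else:
        return 1

# axis='columns' passes each row to stars() as a Series
star_ratings = df.apply(stars, axis="columns")
print(star_ratings.tolist())  # [3, 3, 2, 1]
```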

To add new columns based on multiple conditions:

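# df here is assumed to hold only numeric score columns
df['mean'] = df.mean(axis=1)
# np.select uses the first matching condition, so a mean above 80 is 'A'
# and a mean above 60 but not above 80 is 'B'
condition = [(df['mean'] > 80),
             (df['mean'] < 90) & (df['mean'] > 60)]
result = ['A', 'B']
# Rows matching neither condition get the default; the empty string is an
# assumed placeholder (the implicit default 0 may raise a dtype error
# alongside string choices in recent NumPy versions)
df['Grade'] = np.select(condition, result, default='')
# For a single condition, np.where is enough:
df['Pass/Fail'] = np.where(df['mean'] > 60, 'pass', 'fail')
# Equivalent list comprehension:
df['Pass/Fail'] = ['pass' if x > 60 else 'fail' for x in df['mean']]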
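
A self-contained sketch of the same idea on made-up scores. The column names math and art, the values, and the empty-string default are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical scores for three students
scores = pd.DataFrame({"math": [95, 70, 50], "art": [85, 80, 40]})

# Row-wise mean of the two score columns: 90.0, 75.0, 45.0
scores["mean"] = scores[["math", "art"]].mean(axis=1)

# Conditions are checked in order; the first match wins
conditions = [
    scores["mean"] > 80,
    (scores["mean"] < 90) & (scores["mean"] > 60),
]
grades = ["A", "B"]

# default fills rows that match no condition (assumed placeholder: "")
scores["Grade"] = np.select(conditions, grades, default="")

# Single condition: np.where picks between two values
scores["Pass/Fail"] = np.where(scores["mean"] > 60, "pass", "fail")

print(scores["Grade"].tolist())      # ['A', 'B', '']
print(scores["Pass/Fail"].tolist())  # ['pass', 'pass', 'fail']
```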
