Pandas notes
Latest revision as of 00:17, 21 September 2018
Using the Python Pandas package (http://pandas.pydata.org/) for data analysis. Leo's notes.
Importing and setup
Some useful imports for the examples below:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
Data manipulation
Import data from a CSV file to a dataframe
df = pd.read_csv("filename.csv")
Filter the data by a field value
df2 = df.loc[df['Tagid'] == "1234"]
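One thing to watch with the filter above: read_csv infers column types, so a purely numeric Tagid column is read as integers and a comparison against the string "1234" then matches nothing. A minimal sketch, assuming a small made-up CSV (the data and the names csv_text, df_default and df_str are hypothetical, not from the original notes):

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for filename.csv
csv_text = "Tagid,RSSI\n1234,-60\n5678,-72\n1234,-58\n"

# Default parsing: Tagid is inferred as an integer column
df_default = pd.read_csv(io.StringIO(csv_text))

# Forcing Tagid to be read as text keeps string comparisons meaningful
df_str = pd.read_csv(io.StringIO(csv_text), dtype={"Tagid": str})

# Integer column compared with a string: every comparison is False
no_match = df_default.loc[df_default["Tagid"] == "1234"]

# String column compared with a string: the two matching rows are kept
df2 = df_str.loc[df_str["Tagid"] == "1234"]

print(len(no_match), len(df2))
```

If the tag ids are really numbers, comparing against the integer 1234 instead would also work.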
Get all unique values (for example, for the Tagid field)
allTags = df.Tagid.unique()
Pivot the table so that the unique values of one field become columns. For example, to create a plot that uses each unique Tagid as a separate series:
dfTags = df.pivot(columns='Tagid', values='RSSI')
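To see what the pivot produces, here is a small sketch on made-up data (the tag names and RSSI values are hypothetical). Because no index is given, the existing row index is kept, so each row has a value only in the column of its own tag and NaN elsewhere:

```python
import pandas as pd

# Hypothetical readings: two tags, two samples each
df = pd.DataFrame({
    "Tagid": ["A", "B", "A", "B"],
    "RSSI": [-60, -72, -58, -75],
})

# One column per unique Tagid, RSSI as the cell values
dfTags = df.pivot(columns="Tagid", values="RSSI")

print(dfTags)
```

Calling dfTags.plot() then draws one series per tag; the NaN cells show up as gaps unless a marker is used or the data is interpolated.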
Creating a data frame on the fly, from a list of dictionaries (one per row):
rows = []
for x in mylist:
    rec = {"Name": x, "AA": aa(x), "BB": bb(x)}
    rows.append(rec)
df = pd.DataFrame( rows )
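A runnable version of the same pattern might look like the sketch below. The contents of mylist and the bodies of aa() and bb() are hypothetical stand-ins, since the notes do not define them:

```python
import pandas as pd

# Hypothetical stand-ins for mylist, aa() and bb()
mylist = ["x", "yy", "zzz"]

def aa(name):
    # Placeholder computation: length of the name
    return len(name)

def bb(name):
    # Placeholder computation: upper-cased name
    return name.upper()

# One dict per row; the dict keys become the column names
rows = [{"Name": x, "AA": aa(x), "BB": bb(x)} for x in mylist]
df = pd.DataFrame(rows)

print(df)
```

Collecting the rows in a list and building the frame once is usually cheaper than growing a DataFrame row by row.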
Remove one or more columns
columns = ['Col1', 'Col2', ...]
df.drop(columns, inplace=True, axis=1)
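As an alternative sketch, drop also accepts a columns= keyword, and without inplace=True it returns a new frame and leaves the original untouched. The frame and column names below are made up for illustration:

```python
import pandas as pd

# Hypothetical frame with three columns
df = pd.DataFrame({"Col1": [1, 2], "Col2": [3, 4], "Col3": [5, 6]})

# columns= is equivalent to passing the labels with axis=1
trimmed = df.drop(columns=["Col1", "Col2"])

print(list(trimmed.columns))
print(list(df.columns))
```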
Plot options
Many visualization examples are at https://pandas.pydata.org/pandas-docs/stable/visualization.html.
Define the colors of the plot data
df.plot(color="rgbk")
Different marker styles
df.plot(marker='.')
Limit the X axis values:
df.plot(xlim=(0,4000))
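These options can be combined in one call. A minimal sketch, assuming a small made-up frame and the non-interactive Agg backend so that it runs without a display; the colors are given as an explicit list, one entry per column:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available
import pandas as pd

# Hypothetical data with two series
df = pd.DataFrame({"A": [1, 2, 3], "B": [3, 2, 1]})

# One color per column, a dot marker on every point, fixed X range
ax = df.plot(color=["r", "k"], marker=".", xlim=(0, 4000))

print(ax.get_xlim())
```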
Naming the axis
ax = df.plot()
ax.set_ylabel(AntNames[x])
Placement of the legend (below the plot, aligned to its left edge)
df.plot().legend(loc='upper left', bbox_to_anchor=(0, 0))
Removing the legend
ax = df.plot()
ax.legend_.remove()
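The legend handling above can also be sketched through the public accessors. The example assumes a small made-up frame and the Agg backend; ax.get_legend() is used in place of the ax.legend_ attribute, and legend=False is shown as a way to skip the legend from the start:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available
import pandas as pd

# Hypothetical data with two series
df = pd.DataFrame({"A": [1, 2, 3], "B": [3, 2, 1]})

# Anchor the legend's upper-left corner at the axes' lower-left corner (0, 0),
# which places the legend box below the plot
ax = df.plot()
legend = ax.legend(loc="upper left", bbox_to_anchor=(0, 0))
placed = ax.get_legend() is legend

# Remove the legend again
ax.get_legend().remove()
removed = ax.get_legend() is None

# Or never create one
ax2 = df.plot(legend=False)
has_legend = ax2.get_legend() is not None

print(placed, removed, has_legend)
```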
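Plotting means and standard deviations (see https://megapteraphile.wordpress.com/2015/11/03/plotting-means-and-stds-with-pandas/):
mymean = byTagAnt.RSSI.mean()
mystd = byTagAnt.RSSI.std()
mymean.plot(kind="bar", yerr=mystd);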
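The notes do not show how byTagAnt is built; it is presumably a groupby object. A minimal sketch under that assumption, grouping made-up readings by Tagid only (the data and the name byTag are hypothetical):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available
import pandas as pd

# Hypothetical readings: two tags, two samples each
df = pd.DataFrame({
    "Tagid": ["A", "A", "B", "B"],
    "RSSI": [-60.0, -62.0, -70.0, -74.0],
})

# Assumed construction of the grouped object
byTag = df.groupby("Tagid")

mymean = byTag.RSSI.mean()  # one mean per tag
mystd = byTag.RSSI.std()    # sample standard deviation per tag

# Bars show the means, error bars show the standard deviations
ax = mymean.plot(kind="bar", yerr=mystd)

print(mymean)
print(mystd)
```

Grouping by two fields, for example a tag and an antenna column, would give a two-level index and one bar per combination.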
Plotting with various parameters:
p = mymean.plot(figsize=(15,5), legend=False, kind="bar", rot=45, color="green", fontsize=16, yerr=mystd);
p.set_title("RSSI", fontsize=18);
p.set_xlabel("Tags", fontsize=18);
p.set_ylabel("dBm", fontsize=18);
p.set_ylim(0,-85);
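Note that set_ylim(0, -85) passes the limits in descending order, which inverts the Y axis: 0 sits at the bottom and -85 at the top, so bars for negative dBm values grow upward. A small sketch with made-up means, again assuming the Agg backend:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available
import pandas as pd

# Hypothetical per-tag mean RSSI values
mymean = pd.Series([-61.0, -72.0], index=["A", "B"])

p = mymean.plot(kind="bar", rot=45, color="green")
p.set_ylim(0, -85)  # bottom=0, top=-85: the axis is inverted

print(p.get_ylim(), p.yaxis_inverted())
```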