generate function
Wed, 29 Mar 2023

# Read the dataframe called "global_dataframe" from a csv file in the current directory called "Sum Data.csv". Define "KPI": group the dataframe by column "Volt" and get the mean of columns "NEVs" and "MWhT". Call "KPI" on "global_dataframe" as "gKPI". Calculate "NNewInd" as 10% of the number of indexes of "global_dataframe", rounded. Calculate "NVEI" as the number of different elements in the column "Volt" of "global_dataframe". Define the dataframe "RdataFrame" with columns: - "absGap" - "dG1", "dG2", "dG3", etc., with "NVEI" columns. - "Id1", "Id2", "Id3", etc., with "NNewInd" columns. Start a loop of 1000 "Sample" in "Samples"; for each sample: - Get the sub dataframe "SDFrame" composed of 10% of the indexes of "global_dataframe", chosen randomly. Store the chosen indexes, transposed, in the list "Indexes". - Call "KPI" on "SDFrame" as "sKPI". - Calculate the absolute error, "absGap", and the detailed error, "detGap", between "sKPI" and "gKPI". - Store "absGap" and "detGap" in the dataframe "RdataFrame" in columns "absGap", "dG1", "dG2", "dG3", etc. - Store "Indexes" in the dataframe "RdataFrame" in the columns "Id1", "Id2", "Id3", etc.

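```python
import numpy as np
import pandas as pd

global_dataframe = pd.read_csv("Sum Data.csv", low_memory=False)
# KPI: mean of "NEVs" and "MWhT" per "Volt" group (a list selects several columns)
KPI = lambda x: x.groupby("Volt")[["NEVs", "MWhT"]].mean()
gKPI = KPI(global_dataframe)
NNewInd = round(len(global_dataframe) * 0.1)  # 10% of the rows, rounded
NVEI = global_dataframe["Volt"].nunique()     # number of distinct "Volt" values
columns = ["absGap"] + ["dG" + str(i) for i in range(1, NVEI + 1)] + ["Id" + str(i) for i in range(1, NNewInd + 1)]
rows = []
for Sample in range(1000):
    SDFrame = global_dataframe.sample(n=NNewInd)
    Indexes = SDFrame.index.tolist()
    sKPI = KPI(SDFrame)
    # detGap: one value per "Volt", summed over both KPI columns (assumed reading of the prompt)
    detGap = np.abs(sKPI - gKPI).sum(axis=1).reindex(gKPI.index)
    absGap = detGap.sum()
    rows.append([absGap] + detGap.tolist() + Indexes)
RdataFrame = pd.DataFrame(rows, columns=columns)
```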
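The procedure described in the prompt can be tried without "Sum Data.csv" by running it on synthetic data. The sketch below is only an illustration under stated assumptions: the helper names `kpi` and `sample_gaps`, the made-up demo values, the fixed seeds, and the reading of "detGap" as one absolute gap per "Volt" value (summed over "NEVs" and "MWhT") are not part of the original snippet. A "Volt" value that happens to be absent from a sample contributes 0 to the gap here, which may or may not be the desired behaviour.

```python
import numpy as np
import pandas as pd

def kpi(df):
    """Mean of NEVs and MWhT per Volt group."""
    return df.groupby("Volt")[["NEVs", "MWhT"]].mean()

def sample_gaps(df, n_samples=1000, frac=0.1, seed=0):
    """Hypothetical helper: draw random subsets and record their KPI gaps."""
    g_kpi = kpi(df)
    n_rows = round(len(df) * frac)
    rows = []
    for k in range(n_samples):
        sub = df.sample(n=n_rows, random_state=seed + k)
        # Per-Volt absolute gap, summed over both KPI columns (assumed reading of "detGap")
        det_gap = (kpi(sub) - g_kpi).abs().sum(axis=1).reindex(g_kpi.index)
        rows.append([det_gap.sum()] + det_gap.tolist() + sub.index.tolist())
    cols = (["absGap"]
            + [f"dG{i}" for i in range(1, len(g_kpi) + 1)]
            + [f"Id{i}" for i in range(1, n_rows + 1)])
    return pd.DataFrame(rows, columns=cols)

# Synthetic stand-in for "Sum Data.csv" (made-up values, for illustration only)
rng = np.random.default_rng(42)
demo = pd.DataFrame({
    "Volt": rng.choice([110, 220, 400], size=200),
    "NEVs": rng.integers(0, 50, size=200),
    "MWhT": rng.uniform(0.0, 10.0, size=200),
})

result = sample_gaps(demo, n_samples=50)

# Row with the smallest total gap, i.e. the sample closest to the global KPI
best = result.loc[result["absGap"].idxmin()]
id_cols = [c for c in result.columns if c.startswith("Id")]
best_indexes = best[id_cols].astype(int).tolist()
print(result.shape, best["absGap"])
```

Collecting the rows in a list and building the dataframe once at the end avoids growing a dataframe row by row inside the loop, which tends to be slow in pandas. The `best_indexes` list can then be passed to `demo.loc[...]` to recover the most representative subset.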