Pandas Datetime
Manipulating datetime columns in pandas
This post explains how to work with dates and times in pandas. Datasets very often include date and time columns, and how such a column should be transformed depends on the use case.
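As a rough illustration of what transforming such a column can look like, here is a minimal sketch. The DataFrame, the column name order_date, and the sample values are made up for the example; only pd.to_datetime and the .dt accessor are standard pandas.

```python
import pandas as pd

# Hypothetical sample data: dates stored as plain strings
df = pd.DataFrame({"order_date": ["2021-01-15", "2021-02-28", "2021-03-01"]})

# Parse the strings into a proper datetime64 column
df["order_date"] = pd.to_datetime(df["order_date"], format="%Y-%m-%d")

# Derive new columns from the datetime column through the .dt accessor
df["year"] = df["order_date"].dt.year
df["month"] = df["order_date"].dt.month
df["weekday"] = df["order_date"].dt.day_name()

print(df)
```

Which derived columns are worth keeping depends on the use case, for example month for seasonal grouping or weekday for weekly patterns.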
Pyspark supported data sources
Spark supports various data sources. Some core data sources are built into Spark, while others are developed and maintained by the community. In this post, I explain the core data sources supported by PySpark.
Combine multiple csv files in python
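import glob

import pandas as pd

extension = 'csv'
output_file = 'combined_file.csv'

# Collect every CSV in the current working directory,
# skipping the combined file left behind by a previous run
csv_files = [f for f in sorted(glob.glob('*.{}'.format(extension))) if f != output_file]

# Read each file into its own DataFrame
df_list = []
for file in csv_files:
    df = pd.read_csv(file)
    df_list.append(df)

# Stack the DataFrames row-wise and write the result without the index column
pd.concat(df_list, ignore_index=True).to_csv(output_file, index=False)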
Note: To do the same with Excel files, change the value of extension and use the read_excel method instead of read_csv.
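One way to make the reader swappable is to wrap the pattern in a small function. This is only a sketch: the function name combine_files and its parameters directory, pattern, and reader are hypothetical names chosen for the example, not part of pandas.

```python
from pathlib import Path

import pandas as pd


def combine_files(directory, pattern="*.csv", reader=pd.read_csv):
    """Read every file in `directory` matching `pattern` and stack the rows.

    `reader` is any function that takes a path and returns a DataFrame,
    such as pd.read_csv or pd.read_excel.
    """
    # Sort so the row order of the result is reproducible
    files = sorted(Path(directory).glob(pattern))
    if not files:
        raise FileNotFoundError(f"no files matching {pattern!r} in {directory}")
    frames = [reader(f) for f in files]
    # ignore_index=True renumbers the rows instead of repeating 0..n per file
    return pd.concat(frames, ignore_index=True)
```

For CSV files, combine_files(".") is enough. For Excel workbooks, something like combine_files(".", pattern="*.xlsx", reader=pd.read_excel) should work, assuming an Excel engine such as openpyxl is installed.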