Create Pandas Dataframe on Databricks

Workaround to read csv from DBFS using pandas

Vasav

1 minute read

I had some trouble reading a CSV file directly with pandas on Databricks Community Edition. After going through a few articles, I finally found a workaround. Databricks has disabled reading a CSV directly from DBFS with pandas, so you may encounter FileNotFoundError: [Errno 2] No such file or directory:.

How Apache Spark works

Spark Architecture

Vasav

4 minute read

Apache Spark is a distributed compute engine used to process large volumes of data. In this article, I explain how it works behind the scenes.