
In geometry, collinearity of a set of points is the property of their lying on a single line. A set of points with this property is said to be collinear. In this post, I’ll illustrate how to find collinear points using Apache Spark. This is one of the programming labs of DSE230x Big Data Analytics with Apache Spark.
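To make the definition concrete before getting to Spark, here is a minimal plain-Python sketch. It is not the notebook's code: the function names `are_collinear`, `line_key`, and `collinear_sets` are hypothetical, and it assumes points are distinct `(x, y)` tuples with integer coordinates. Three points are collinear when the cross product of the two vectors they form is zero, and larger collinear sets can be found by grouping every pair of points by the line that passes through them.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations


def are_collinear(p, q, r):
    """Return True when p, q, r lie on one line (cross product is zero)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]) == 0


def line_key(p, q):
    """Identify the line through p and q (assumes integer coordinates)."""
    if p[0] == q[0]:
        # Vertical line: the slope is undefined, so key on the x value.
        return ("vertical", p[0])
    slope = Fraction(q[1] - p[1], q[0] - p[0])  # exact, no float rounding
    intercept = p[1] - slope * p[0]
    return (slope, intercept)


def collinear_sets(points):
    """Group points by shared line and keep groups of three or more."""
    groups = defaultdict(set)
    for p, q in combinations(points, 2):
        groups[line_key(p, q)].update((p, q))
    return {frozenset(g) for g in groups.values() if len(g) >= 3}


points = [(0, 0), (1, 1), (2, 2), (0, 1), (1, 0)]
print(are_collinear((0, 0), (1, 1), (2, 2)))  # True
print(collinear_sets(points))
```

The pairwise loop is quadratic in the number of points, which is fine for a small illustration and is the kind of work a distributed engine such as Spark can spread across a cluster.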
The entire project code is embedded below, along with Markdown cells that explain the key steps. I use Jupyter notebooks for my data science projects, so I run the PySpark environment inside Jupyter as well. You can also open the notebook below in Google Colab.
Here is the GitHub repository, which includes two sample data sets for testing the code:
https://github.com/Yashv2211/Collinear-Points-Using-Apache-Spark
That’s all! Thanks for reading!