Key Responsibilities:
A. Should be conversant with the Apache Spark architecture: RDDs, the various transformations and actions, and Spark configuration and tuning techniques
B. Knowledge of the Hadoop architecture: execution engines, frameworks, applications, and tools
C. PySpark using the Spark MLlib library
D. Exposure to data warehousing concepts and methods
Technical Experience:
A. Should have excellent development experience with Python, producing data applications
B. Should have 8 years of experience using PySpark with Spark RDDs, Spark SQL, and DataFrames
C. Should have experience in AWS SageMaker and AWS Glue
D. Should have experience in data wrangling and data analysis with Pandas and NumPy
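For illustration, a minimal sketch of the kind of data wrangling and analysis with Pandas and NumPy referenced above; the dataset and column names are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical sales records containing a missing value and a duplicate row.
df = pd.DataFrame({
    "region": ["east", "west", "east", "east"],
    "units": [10, 5, np.nan, 10],
})

# Wrangling: drop the duplicate row, then fill the missing value
# with the column mean.
df = df.drop_duplicates()
df["units"] = df["units"].fillna(df["units"].mean())

# Analysis: total units per region.
totals = df.groupby("region")["units"].sum()
print(totals.to_dict())  # → {'east': 17.5, 'west': 5.0}
```

The same aggregation translates directly to a PySpark DataFrame (`df.groupBy("region").sum("units")`), which is one reason Pandas experience pairs well with the PySpark requirement above.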
Professional Attributes:
A. Should have good communication and analytical skills
B. Team player
Educational Qualification: Graduate
Skills: PySpark, Data Warehouse (DWH), SQL, and Amazon Web Services (AWS)