I am working on a project that involves parsing a large dataset to extract meaningful insights. The data comes from several sources in different formats, including JSON, XML, and CSV. It is also unstructured and noisy, with missing values, duplicate entries, and irrelevant information. I need help automating the parsing, cleaning the data, and preparing it for analysis.
Here are the specific tasks I need help with:
1. Identify the data sources and their formats.
2. Write a script that can automatically parse the data from these sources.
3. Clean the data by removing noise and handling missing values.
4. Handle duplicate entries in the data.
5. Convert the cleaned data into a structured format that is suitable for analysis.
Given the data file {Data File}, the data format {Data Format}, and the necessary cleaning procedures {Cleaning Procedures}, write a Python script that performs these tasks.
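For orientation, the tasks above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not a definitive implementation. The function names `parse_records` and `clean_records`, the `required_fields` and `fill_value` parameters, and the assumed record layouts (a JSON list of objects, a CSV file with a header row, an XML root whose children are records) are all hypothetical choices made for the example.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def parse_records(text, data_format):
    """Parse raw text into a list of dicts. Supports 'json', 'csv', and 'xml'."""
    fmt = data_format.lower()
    if fmt == "json":
        data = json.loads(text)
        # Assumption: a top-level list of objects, or a single object.
        return data if isinstance(data, list) else [data]
    if fmt == "csv":
        # Assumption: the first row is a header naming the fields.
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "xml":
        root = ET.fromstring(text)
        # Assumption: each child of the root is one record,
        # and each child of a record is one field.
        return [{field.tag: field.text for field in record} for record in root]
    raise ValueError(f"unsupported format: {data_format}")


def clean_records(records, required_fields, fill_value=""):
    """Drop records missing a required field, fill other gaps, remove duplicates."""
    # Collect every field name seen, so all output records share the same keys.
    all_fields = sorted({k for rec in records for k in rec if k is not None})
    seen = set()
    cleaned = []
    for rec in records:
        norm = {}
        for key in all_fields:
            value = rec.get(key)
            if isinstance(value, str):
                value = value.strip()
            # Treat None and empty strings as missing.
            norm[key] = None if value in (None, "") else value
        if any(norm.get(f) is None for f in required_fields):
            continue  # a required field is missing: discard the record
        norm = {k: (fill_value if v is None else v) for k, v in norm.items()}
        # Exact-duplicate detection on the normalised field values.
        fingerprint = tuple((k, str(v)) for k, v in norm.items())
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(norm)
    return cleaned


if __name__ == "__main__":
    sample_csv = "id,name,city\n1, Ada ,London\n1,Ada,London\n2,,Paris\n3,Grace,\n"
    rows = clean_records(
        parse_records(sample_csv, "csv"),
        required_fields=["id", "name"],
        fill_value="unknown",
    )
    for row in rows:
        print(row)
```

In this sketch, duplicates are detected only after whitespace stripping and only when every field matches exactly. Fuzzy matching, type conversion, and source-specific cleaning rules would depend on the actual data and are left out.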