Job Category: Remote
Job Type: Full Time
Skills: Analytics, Apache Spark, Clustering, Data Analysis & Statistics, Data Extraction, Data Integration, Data Wrangling, ETL & ELT, Forecasting, Machine Learning, MapReduce, NoSQL, PostgreSQL, Python, R, SAS & SPSS, Tableau, Visualization
Qualifications
High school diploma with 0-2 years of relevant experience, or commensurate experience.
Responsibilities
- Data Integration: Apply data wrangling tools, including ETL, ELT, and programming languages, to collect and blend data from operational and external systems.
- Data Analysis: Perform data mining, machine learning, and statistical analysis to create predictive and descriptive models that support decision-making and productivity.
- Clustering and Forecasting: Develop segmentation, clustering, and forecasting models to identify trends and insights that can drive business and mission impact.
- Data Extraction and Transformation: Collect and transform both structured and unstructured data, including relational and NoSQL data, using ETL and ELT tools and custom code.
- Data Visualization: Use data discovery and visualization tools such as Tableau and Trifacta to interpret and present findings in a compelling and actionable manner.
- Model Development: Apply machine learning algorithms using COTS tools (e.g., SAS, SPSS, Oracle) and build analytical solutions using Python, R, and relevant libraries (e.g., scikit-learn for Python, caret for R).
- Iterative Methods: Interpret and evaluate the accuracy of results using iterative and agile methods.
- Collaborative Efforts: Work closely with business and data SMEs to prioritize information needs and ensure alignment with organizational goals.
- Maintain Analytical Systems: Integrate analytical solutions with operational systems, ensuring data accuracy and system performance.
Qualifications
Required Skills
- Clearance: Top Secret/SCI
- Experience in metadata creation and analysis
- Familiarity with data labeling standards
- Experience with Commercial Solutions for Classified (CSfC)
- Knowledge of setup, configuration, and maintenance of Storage Area Networks (SANs)
Certifications
- Relevant certifications in Data Science, Machine Learning, and Programming (e.g., SAS, SPSS, Oracle, Python, R)
- Additional certifications related to security clearance (Top Secret/SCI) and relevant industry standards are advantageous