Data Scientist
Caterpillar
Job Description
Key job responsibilities include:
- Data mining using state-of-the-art methods
- Enhancing data collection procedures to include information that is relevant for building analytic systems
- Processing, cleansing, and verifying the integrity of data used for analysis
- Doing ad-hoc analysis and presenting results in a clear manner
- Using statistical research and analysis to determine which approach and algorithm to apply
- Creating automated anomaly detection systems and continuously tracking their performance
- Selecting features, building and optimizing classifiers using machine learning techniques
- Building predictive models and machine-learning algorithms
- Combining models through ensemble modeling
- Defining and documenting architecture, capturing non-functional (architectural) requirements, preparing estimates, and defining technical solutions in response to requests for proposals (RFPs)
- Providing technical leadership to the project team from design through deployment: offering guidance, performing reviews, and preventing and resolving technical issues
- Analyzing complex business and competitive issues and discerning their implications for systems support; identifying, defining, directing, and performing analyses of the technical and economic feasibility of proposed data solutions
- Creating and managing a machine learning pipeline, from raw data acquisition through merging and normalization to sophisticated feature engineering and model execution
- Exhibiting advanced visualization skills and creative problem-solving
- Interacting with customers to gain an in-depth understanding of their operations and to improve their processes for managing equipment and interfacing with Caterpillar.
- Providing project leadership, advice, and counsel to developers, management, customers and project teams on the most complex aspects of application development and system integration.
Qualifications
Basic (Required) Qualifications:
- Strong analytical skills.
- Strong math skills (e.g. statistics, algebra)
- Problem-solving aptitude
- A bachelor’s degree from an accredited college or university, or an equivalent four years of experience relevant to this role.
- Five or more years of progressively responsible, role-relevant experience that demonstrates both breadth of business knowledge and depth of digital and analytical skill sets.
- Curiosity about and a deep interest in how digital technology and systems are powering the way users do their jobs.
- Comfortable working in a dynamic environment where digital is still evolving as a core offering.
- Ability to clearly and succinctly explain complex topics.
- Expert in Python; familiar with concurrency in Python
- Proficient with Git or a similar source control system, and experienced with Git-based development workflows
- Ability to design and consume well-tailored REST APIs
- Experience working in development teams with code reviews and varying levels of seniority
- Experience both in jumping into an existing architecture and in starting projects from scratch
- Proven ability to take initiative and dive into areas of new technology
- Self-motivated with a passion for learning, analyzing technology tradeoffs, and shipping product
- Good applied statistics skills, including distributions, statistical testing, and regression
- Proficiency in using query languages such as SQL
- Good scripting and programming skills
- Data-oriented personality