We turn the Sword Spotlight on Sam Omidi, who recently joined the Energy team as a Data Scientist. Here he talks us through a little background and reflections on his first few months with Sword.
“I think the most surprising thing I’ve learned is that I’m not a fake data scientist among real ones. They call them unicorns for a reason. I can now say with confidence that I can code and that I do know the software development lifecycle and statistics. Without these skills, life as a data scientist would be a struggle. In general, people have strengths and experience in one or two areas and are weaker in others. I now believe a data scientist needs to know more than just a couple of things. They need to know how to handle out-of-memory errors, understand data structures and expectation maximization, and recognise where heteroscedasticity is coming from.
“I shifted my focus to data science and machine learning in the final year of my bachelor's degree, when I worked on a data mining and social network analysis project. This interest truly materialised when I completed my Master's in Data Science.
“I joined Sword three months ago because I found the team very passionate about their work. The projects they are working on are also unique and exciting.
“In my first week I got heavily involved, along with my lead, in designing a data pipeline and evaluating various novelty detection algorithms. It has been absolutely fantastic to hit the ground running from the start and to feel part of the team immediately.
“In subsequent projects I also had the chance to deepen my knowledge of neural networks and focus my research on them, developing a more in-depth understanding.
“A typical day for me starts around 8:00am, when I catch up on my social media accounts related to machine learning and data science. I switch to work projects around 8:30am and finish around 5:00pm, with a break for lunch. About 40% of my time is spent on research and development, with a strong focus on mathematics so that I understand what is actually happening under the hood of every algorithm that might be a good candidate for our project. The work involves anything from developing and testing new algorithms to writing mathematical proofs to simplify data problems.
“Another 30% of my time is spent researching best coding practices in data science, which often reveals problems related to data capture, data pipelines or the specific implementation of an algorithm that needs modification. This is probably one of the most crucial aspects of the job. As a data scientist, it is important to communicate with a wide range of stakeholders, so it helps to be able to simplify my explanations of machine learning algorithms to a layman's level.
“The other 20% of my time is spent meeting with my colleagues. We discuss the projects and support each other on our tasks.
“Each project is unique, and I try to let the project and its initial findings guide my next steps. I mainly use Python and the Azure platform for projects, though R and SAS are occasionally helpful for specific packages or R&D requests. I can usually reuse code, but each problem has its own mathematical assumptions and data limitations.
“Overall, it has been a great start and I am looking forward to what the future holds for me and the team.”