Hi, I'm Arshman Khalid

Data Scientist
and Data Analyst
Based In Australia.

About Arshman

As a self-directed and meticulous Data Scientist, Arshman Khalid has extensive knowledge of Machine Learning, Software Engineering, and Agile environments. Throughout his career, his work ethic and proven track record with data analytics, business intelligence, and cloud technology projects have brought significant value to his teams' decision-making process. In 2021, he co-founded CloudLyte, a thriving startup that develops IT services to increase efficiency, safety, and quality for its partners and clients.

Arshman’s solid grasp of Reporting Technologies and the ICT industry, together with his ability to gather requirements using human-centered design and UX principles, underpins his comprehensive approach. From translating business requirements into technical specifications to building and maintaining positive relationships with the different stakeholders involved, Arshman is passionate about using advanced technologies to solve business problems.

The missions and projects he handled at CSIRO gave him hands-on experience in developing predictive Machine Learning tools and workflows and in managing large datasets, all while being an active member of a multi-disciplinary and often regionally dispersed research team. Before relocating to Australia, Arshman was a Full Stack Developer at Conrad Labs, Pakistan, where he operated in an agile environment to create product and business value. Before joining the workforce, he complemented his academic studies by developing several cloud-based solutions, dashboards, and advanced algorithms to sharpen his skills.

Because he believes in the importance of continuous learning, Arshman constantly follows the latest research and developments in Data Science methods and technologies to stay ahead of current trends and keep his skills up to date. Before obtaining his Master’s Degree in Data Science from the University of Melbourne in 2022, Arshman received his Bachelor of Software Engineering from the University of Management & Technology (UMT) in Lahore, Pakistan. To experience global education, he completed two undergraduate exchange programs, at Tianjin Polytechnic University (China) and Northeastern University of Illinois (USA).

Work & Education

January 2021 - Present


Data Scientist
  • Founded a startup that focuses on the development of IT services to increase efficiency, safety and quality.
  • Participating in all aspects of business development from market research to operations and finance, ensuring that the company's vision is followed.
  • Building and maintaining professional relationships with potential investors and partners.
  • Formulating the company's vision, goals, and objectives; hiring and training new employees.

December 2020 - December 2021


Data Scientist
  • Developed predictive Machine Learning tools for multi-scale, multivariate data.
  • Developed workflows for exploring, pre-processing, and manipulating large image datasets through Machine Learning applications and programming interfaces.
  • Developed tools (code) to assimilate hard data on rock composition and rock properties, along with diverse imaging and micro-analytical data types from multiple sources, into training and prediction workflows.
  • Used CSIRO HPC and cloud-based facilities for managing and processing large datasets.
  • Worked effectively as part of a multi-disciplinary, often regionally dispersed research team, to carry out associated tasks under the guidance of more senior Research Scientists / Engineers.

    July 2019 - January 2020


    Full Stack Developer
    • Handled all verbal and written communications between hosting companies, clients, and vendors.
    • Documented, coached, and elicited business requirements from cross-functional stakeholders by writing user stories and acceptance criteria, resulting in a clearer, detailed, and complete understanding of project deliverables.
    • Operated in an agile environment to deliver product and business value through product backlog prioritization, well-specified user stories, sprint planning, retrospectives, and daily stand-ups.
    • Designed test cases and conducted four types of testing (regression, integration, performance, and user acceptance) against the acceptance criteria to verify that the client’s needs were met.
    • Integrated a third-party API from a secure server to fetch data in JSON format and loaded it into Tableau to generate reports and insights.

    March 2020 - June 2022

    University of Melbourne

    Master of Data Science
    • Statistical Machine Learning
    • Statistical Modelling for Data Science
    • Computational Statistics for Data Science
    • Multivariate Statistics for Data Science
    • Information Visualization
    • The Ethics of Artificial Intelligence
    • Cluster and Cloud Computing
    • Advanced Database Systems
    • Deep Learning and Convolutional Neural Networks

    September 2015 - September 2019

    University of Management and Technology

    Bachelor of Software Engineering (Hons)
    • Data Structures and Algorithms
    • Object Oriented Programming
    • Analysis of Algorithms
    • Probability and Statistics
    • Discrete Mathematics
    • Differential Equations
    • Software Engineering
    • Software Construction
    • Software Requirement Engineering
    • Software Quality Testing
    • Software Project Management

    July 2018 - December 2018

    Tianjin Polytechnic University

    Exchange Student
    • Machine Learning
    • Big Data Programming
    • Data Mining
    • Artificial Intelligence
    • Database Systems
    • Computer Vision
    • Natural Language Processing


    “It is easy to lie with statistics. It is hard to tell the truth without statistics.”


    • Python
    • R Language
    • SQL
    • C++
    • Java
    • Swift
    • Solidity


    • Alteryx Designer Core
    • SQL Server
    • MySQL
    • C++
    • Couchbase
    • MongoDB
    • Postgres

    Reporting Technologies

    Cloud Technologies

    • AWS
    • Nectar

    Involved in gathering data, massaging it into a tractable form, making it tell its story, and presenting that story


    Here are some selected works he has completed lately. Feel free to check them out.

    Twitter Data Processing

    Cluster and Cloud Computing

    This project delivers a simple, parallelised application that leverages the power of SPARTAN, the University of Melbourne's High-Performance Computing (HPC) facility. Using the TwitterGeoProcessor package, a large file of geocoded tweets can be explored and analysed to extract relevant information such as the number of posts in individual regions and the trending hashtags in each of those regions. The package is implemented in Python and built on MPI (Message-Passing Interface) concepts to improve processing performance in the HPC environment.
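
A minimal sketch of the split, count, and merge pattern that this kind of job relies on. It is not the TwitterGeoProcessor code: the `(region, text)` input format and the names `count_chunk`, `merge`, and `process` are assumptions made for illustration, and the chunks are processed serially here. In an MPI setting, each chunk would be handled by a separate rank and the partial counts gathered at the root rank.

```python
import re
from collections import Counter

# Assumed hashtag pattern: '#' followed by word characters.
HASHTAG = re.compile(r"#\w+")


def count_chunk(tweets):
    """Count posts and hashtags per region for one chunk of (region, text) pairs."""
    posts = Counter()
    tags = {}
    for region, text in tweets:
        posts[region] += 1
        found = (tag.lower() for tag in HASHTAG.findall(text))
        tags.setdefault(region, Counter()).update(found)
    return posts, tags


def merge(results):
    """Combine the partial counts produced by each worker."""
    total_posts = Counter()
    total_tags = {}
    for posts, tags in results:
        total_posts.update(posts)
        for region, counter in tags.items():
            total_tags.setdefault(region, Counter()).update(counter)
    return total_posts, total_tags


def process(tweets, n_workers):
    """Split tweets round-robin, count each chunk, then merge the results."""
    # Worker i takes items i, i + n_workers, i + 2 * n_workers, ...
    chunks = [tweets[i::n_workers] for i in range(n_workers)]
    return merge(count_chunk(chunk) for chunk in chunks)
```

Because the counts are simple sums, the merged result does not depend on how the tweets are split, which is what makes the workload easy to parallelise.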

    Analysing Victorian Road Crash Data

    Information Visualization

    Road safety data is provided by VicRoads for educational and research purposes. This dataset contains 74,908 observations of fatal and injury crashes recorded on Victorian roads in Australia between July 2013 and March 2019. It allows users to analyse these crashes by a range of factors such as time, location, conditions, crash type, road user type, and object hit.

    Multi-Armed Bandits - Yahoo News

    Statistical Machine Learning

    Several advanced multi-armed bandit (MAB) algorithms were implemented: Thompson sampling (TS) MAB, TS contextual MAB with linear payoffs, TS MABs with fair exposure, and SquareCB contextual MAB with a regression oracle.
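
As an illustration of the simplest algorithm in that list, here is a minimal sketch of Beta-Bernoulli Thompson sampling on simulated arms. It is not the project's implementation: the function name `thompson_sampling`, the uniform Beta(1, 1) prior, and the simulated click rates are assumptions, and the contextual and fair-exposure variants are not covered.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling against simulated Bernoulli arms.

    true_rates holds the hidden success probability of each arm (assumed
    values, used only to simulate rewards). Returns (pulls, successes).
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one plausible success rate per arm from its Beta posterior
        # (Beta(1, 1) prior updated with the observed successes and failures).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled rate is highest.
        arm = max(range(n_arms), key=samples.__getitem__)

        # Simulate a Bernoulli reward and update that arm's posterior counts.
        reward = 1 if rng.random() < true_rates[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls, successes
```

Sampling from the posterior, rather than always taking the arm with the best observed average, is what balances exploration and exploitation: arms with few pulls have wide posteriors and still get chosen occasionally.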

    Authorship Attribution (Identifying the Author of a Given Document)

    Statistical Machine Learning

    The goal of this project was to predict whether an author was responsible for a particular piece of work. For training, I was provided with the publication year, keywords, co-authors, and venue associated with each piece of work. The task was to predict, for 2000 test samples, whether the author listed for a piece of literature was its real author.

    Arshman is an intellectually sound and patient person. He always puts his best foot forward and is quite hardworking. He is always a team player and it's been great collaborating on multiple group projects with him. He has a unique passion for technology and all things data. Can't wait to see the great things he achieves!

    Pooja Wath, Data Scientist, AIA

    Get In Touch

    I'm happy to connect, listen and help. Let's work together and build something awesome. Email Me.