Automatically read, interpret, and process images of scanned text and handwriting. While we wait for the digital revolution to sweep through all areas of business, we are often tasked with creating novel digital interfaces to analog problems. Scanned and photographed images of forms are easy for a human to read, transcribe, and redirect, yet time-consuming. Your intelligent algorithm will replace the human and process the often poorly scanned images, digitize typeset and handwritten content, check for and flag errors, verify signatures against our database, detect the language, search for embedded features like QR codes, and route the extracted information to the proper internal system.
PostFinance
We have an exciting project that automatically analyses news about our customers with an AI in order to help them find the right products. We challenge you to use a new method, active learning, to fine-tune our news-analysis models with human user feedback.
NordLB
AI-based analysis of existing cost calculations for different working steps: identifying and clustering similar work steps from the Excel files and suggesting a historical price range for each work step, including the average price.
EDAG
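A minimal sketch of the clustering-plus-pricing idea, assuming the Excel rows boil down to a free-text work-step description and a historical price; the step texts, prices, and cluster count below are invented for illustration, and TF-IDF with k-means (scikit-learn) stands in for whatever similarity model the team ultimately picks:

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical work steps and historical prices extracted from the Excel files.
steps = ["drill hole 5mm", "drill hole 8mm", "paint surface red", "paint surface blue"]
prices = [10.0, 12.0, 40.0, 45.0]

# Vectorise the free-text step descriptions and group similar steps.
X = TfidfVectorizer().fit_transform(steps)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# For each cluster, report the historical price range and the average price.
ranges = {}
for label, price in zip(labels, prices):
    ranges.setdefault(label, []).append(price)
for label, ps in sorted(ranges.items()):
    print(f"cluster {label}: min={min(ps)} max={max(ps)} avg={sum(ps) / len(ps):.2f}")
```

Grouping by cluster label then yields the min/max/average historical price per family of similar work steps, which is exactly the suggested output of the challenge.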
Staying on top of competitors’ activities is key to maintaining a competitive advantage and to strategic planning. In today’s digital environment there is an abundance of publicly available information on companies and their employees online. The challenge is collecting, filtering, and structuring this data from multiple sources. At Omya, we monitor our competition through a variety of tools but struggle with the time and efficiency involved in collecting, filtering, and structuring the information. In this challenge, students will be asked to develop a method to automate the selection of articles that should be reported in monthly summary reports. The method developed should incorporate machine learning based on prior data, which will be provided.
Omya
Personal running equation machine: Runners who enter races knowing they are properly prepared are successful runners. This applies to professional athletes and amateurs alike. The difference is that professional athletes have a supporting crew of trainers, coaches, and other people who can handle the intelligence, analysis, and forecasting of future performance. Amateurs simply cannot afford this convenience and are reduced to “believing” the data presented by current platforms such as Strava and GarminConnect. These forecasts, however, are less than precise and in reality don’t give the athlete the chance to gain enough insight. The issue becomes much more pronounced in trail and mountain running, where the terrain makes a prediction close to impossible unless the athlete knows his personal running equation. The aim of the project is to design an AI that can use the industry-standard .FIT file format, which includes many parameters of a recorded run, such as altitude change, running pace (speed), heart rate, and many more, which can be used to find a correlation. Unfortunately, there is no software solution to automatically detect the change in slope (how steep an ascent or descent is) in a recorded activity, which is required to forecast the runner’s timing in a future race. The primary goal is thus to automatically detect running speed over xy% slope in a .FIT file. A secondary goal would be a solution that takes a .GPX file, which includes all details of a future race course, and automatically calculates the time required to reach predefined places along that course. Training data is provided in .FIT file format, as are possible simple courses in .GPX file format.
Participant
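The slope-detection step can be sketched as a pure-Python function. In practice the (distance, altitude, speed) samples would be decoded from the .FIT record messages with a parser library; the synthetic samples and the 5% slope banding below are assumptions for illustration:

```python
def speed_by_slope(records, band=5):
    """Average speed (m/s) bucketed by slope band (percent grade).

    records: list of (distance_m, altitude_m, speed_mps) samples ordered
    by time, as they would be decoded from .FIT record messages.
    """
    buckets = {}
    for prev, curr in zip(records, records[1:]):
        run = curr[0] - prev[0]           # distance covered in this segment
        if run <= 0:
            continue                      # skip stationary or bad samples
        rise = curr[1] - prev[1]
        grade = 100.0 * rise / run        # percent slope of the segment
        key = band * round(grade / band)  # snap to the nearest slope band
        buckets.setdefault(key, []).append(curr[2])
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}

# Synthetic run: flat, then a 10% climb where the runner slows down.
samples = [(0, 100, 3.5), (100, 100, 3.5), (200, 100, 3.4),
           (300, 110, 2.2), (400, 120, 2.1)]
print(speed_by_slope(samples))
```

The resulting slope-to-speed table is one plausible form of the “personal running equation”: applied segment by segment to a .GPX course profile, it would give the time-to-waypoint estimates named as the secondary goal.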
Last year’s challenge was superbly helpful in building the brand-new recipe recommendation system on Noonli. We used learnings from the challenge for onboarding and for making first recommendations. For this year’s challenge, we want to perform an experiment called “popular near you”. This feature aims to recommend recipes that are popular in a given location. We use the location and recipe information from publicly available sources such as Twitter. We define such data like this:
– explicit data or user input data (e.g., ratings on a scale of 1 to 5 stars, likes or dislikes, reviews, and product comments) and
– implicit data or behavior data (e.g., viewing an item, adding it to a wish list, the time spent on an article, etc.); in our case, previous recommendations. We are still learning more about the problem. We will bring in the data, which will be publicly available data from Twitter-like platforms or food-rating websites (like Google search results, etc.).
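A minimal sketch of the “popular near you” ranking, assuming the scraped posts reduce to (location, recipe) mention pairs; the locations and recipes below are invented:

```python
from collections import Counter

# Hypothetical (location, recipe) mentions scraped from Twitter-like posts;
# in practice the location would come from geo-tags or profile fields.
mentions = [
    ("Zurich", "fondue"), ("Zurich", "fondue"), ("Zurich", "rosti"),
    ("Bern", "rosti"), ("Bern", "rosti"), ("Bern", "fondue"),
]

def popular_near_you(mentions, location, k=2):
    """Rank recipes by how often they are mentioned near the given location."""
    counts = Counter(recipe for loc, recipe in mentions if loc == location)
    return [recipe for recipe, _ in counts.most_common(k)]

print(popular_near_you(mentions, "Zurich"))
```

The explicit signals (ratings, likes) and implicit signals (views, dwell time) listed above would enter the same pipeline as weighted mentions rather than simple counts.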
In this challenge, you will use ML techniques to determine the feasibility of identifying hidden patterns or insights associated with social skills and functioning among pre-school children diagnosed with Autism Spectrum Disorder. Data is provided from the EU DREAM trial, comparing two interventions (Robot vs Therapist), where the children’s digital biomarkers (skeletal movements, head orientation, and eye gaze) were collected. Performance of their social skills was assessed by therapy tasks consisting of imitation, joint attention, and turn taking.
Participant
It is challenging to determine illness in pets based on behavior alone, in particular with cats. As recommended by veterinarians, the best way to avoid serious health problems with cats is to be aware of their behavior and note alterations in it. AI-based approaches have been demonstrated to predict animal health and wellness based on analysis of recorded videos and behavioral signs. My challenge therefore consists of developing a Microsoft Azure based program to determine from the cat’s facial expression whether it is sick or in pain. The goal is to build the solution with modern Microsoft Azure services. Furthermore, the algorithm should be easy to extend with different data in the future, e.g. from the cat’s drinking station or the frequency with which the litter box is used. The output should include an indication of how reliable it is, so the customer can make a decision (vet visit: yes or no), e.g. with a percentage of certainty.
Participant
Researchers, especially students writing their theses, are often new to their research topic and not familiar with relevant literature. When starting their literature discovery, researchers often use Google Scholar to pick and choose resources based on metrics (number of citations, etc.) or intuition (“I have enough papers”). This is disorganized, inefficient (time vs output), and does not provide research direction. Similar pains are present for those reading or grading the research.
This challenge aims to redefine how literature discovery is done, eliminating (or significantly reducing) the reliance on keyword searches and the tendency to get stuck in local maxima of search clusters. The primary goal of the challenge is to use machine learning to derive hidden relationships and insights (such as clustering distance, text analysis, or a recommendation engine) between literature resources. The secondary goal of the challenge is to structure the literature resources into a knowledge graph, such that resources can be queried and visualized according to their relationships.
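A minimal sketch of the primary goal, deriving relatedness between resources from their text alone; the abstracts are invented, and TF-IDF with cosine similarity stands in for whatever embedding the team ultimately chooses:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented abstracts; real input would come from a bibliographic source.
abstracts = [
    "deep learning for image classification with convolutional networks",
    "convolutional neural networks applied to image recognition tasks",
    "survey of graph databases and knowledge graph query languages",
]

# Embed each abstract and compute pairwise similarity.
X = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
sim = cosine_similarity(X)

# For each paper, surface its most similar other paper as a hidden relationship.
nearest = {}
for i in range(len(abstracts)):
    scores = [(sim[i][j], j) for j in range(len(abstracts)) if j != i]
    nearest[i] = max(scores)[1]
print(nearest)
```

The same similarity scores can later serve as weighted edges when the resources are loaded into the knowledge graph of the secondary goal.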
This challenge requires you to build a recommendation engine that predicts a person’s rating for a specific product. Recommendation engines are very common among our customers in the online retail industry. The goal is to improve the customer experience by providing, in a small area of the webpage, a list of articles the user might be interested in purchasing. Such engines use machine learning to make personalised product recommendations and offers to consumers based on historical patterns and similarities among consumers and products.
The idea is to build a recommendation engine using machine learning at scale. The size of the dataset makes the complexity of the algorithm a key aspect of the challenge. The final model should predict a person’s (consumer’s) rating for a specific product and be able to run on the whole dataset. The team will have access to several datasets from Amazon, ranging from the user-rating dataset to the product metadata, to train their models at scale.
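A minimal sketch of rating prediction by matrix factorisation with plain SGD; the tiny rating matrix is invented, and the real Amazon data would of course be far too large for this dense-loop version, which is exactly where the scale and algorithmic-complexity aspect of the challenge comes in:

```python
import numpy as np

# Invented user x item rating matrix (0 = unrated); the real input would be
# the Amazon user-rating dataset mentioned above.
R = np.array([[5.0, 4.0, 0.0],
              [4.0, 0.0, 1.0],
              [1.0, 1.0, 5.0]])

rng = np.random.default_rng(0)
k = 2                                             # latent-factor dimension
P = rng.normal(scale=0.1, size=(R.shape[0], k))   # user factors
Q = rng.normal(scale=0.1, size=(R.shape[1], k))   # item factors

lr, reg = 0.05, 0.01
for _ in range(1000):              # plain SGD over the observed ratings only
    for u, i in zip(*R.nonzero()):
        err = R[u, i] - P[u] @ Q[i]
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * P[u] - reg * Q[i])

pred = P @ Q.T                     # dense matrix of predicted ratings
print(pred[0, 2])                  # user 0's predicted rating for item 2
```

At real scale the same factorisation would be fitted with a distributed solver (e.g. alternating least squares on Spark) rather than a Python loop.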

Knowledge Engineering
Knowledge engineering could help the participants reduce the complexity of the recommendation engine or improve its predictions. How can knowledge engineering be used at scale with Spark to work on big datasets efficiently? The solution should also answer this question.
Onremote uses video telephony with augmented reality and artificial intelligence to improve the first-time fix rate on service tickets. This is done by creating a video- and IoT-based dataset that contains all the information necessary to solve the problem with remote support, without being on-site. Furthermore, it is possible to evaluate exactly which materials and skills are required when on-site service is needed. This significantly increases the first-time fix rate, as no wasted trips are generated.
The challenge is to develop a user-friendly method to improve our image identification models to recognize device types from different manufacturers in order to increase the accuracy of service requests and to identify exactly who is responsible for them.
The cost of acquiring new customers is much higher than the cost of retaining those already on board, so reducing churn (customer turnover) is a real financial priority. Everything possible should be done to keep current customers satisfied, maintaining their usage and increasing their lifetime value. Moreover, a high churn rate can also damage a product’s Net Promoter Score.
In this challenge you’ll identify reasons for customer churn, create a model to predict churn, and identify actions to take!
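A minimal sketch of the churn-modelling step, on synthetic data: the features (monthly usage hours, support tickets) and the rule generating the churn label are assumptions for illustration, and a logistic regression stands in for whatever classifier the team chooses:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic customers; assumption: low usage plus many tickets drives churn.
rng = np.random.default_rng(42)
n = 200
usage = rng.uniform(0, 20, n)        # monthly usage hours
tickets = rng.integers(0, 6, n)      # support tickets opened
X = np.column_stack([usage, tickets])
y = ((usage < 8) & (tickets >= 3)).astype(int)    # 1 = churned

model = LogisticRegression().fit(X, y)

# The coefficient signs point at the churn drivers: usage should push churn
# probability down, support tickets should push it up.
print(dict(zip(["usage", "tickets"], model.coef_[0])))
print(model.predict([[2.0, 5.0], [18.0, 0.0]]))
```

Reading off which features carry the largest (and which sign of) coefficients is one simple way to turn the predictive model into the "reasons for churn" and "actions to take" the challenge asks for.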