Predicting Protein Allergens Accurately

The AllerCatPro analysis workflow helps researchers predict the likelihood of a protein producing an allergic response in people.


Be it a stuffy nose or a bad rash, allergy symptoms can make one feel miserable. Despite numerous studies, understanding why some people develop allergies remains difficult, as does accurately predicting which ingredients in food or personal care products could be allergens.

Furthermore, as databases of known allergens grow, existing predictive methods become more prone to falsely identifying non-allergenic proteins as potential allergens, said Sebastian Maurer-Stroh at A*STAR’s Bioinformatics Institute (BII). “As an example, some older methods would suggest that 90 percent of our own human proteins should be classified as allergens,” he explained.

Hence, his team, in collaboration with G. F. Gerberick and Nora Krutz from consumer goods company Procter & Gamble (P&G), sought to develop a method that can predict the allergenic potential of a protein with higher accuracy. The researchers first analyzed five major databases of known allergens to build a library of 4,180 proteins associated with allergy. Using this library, they were able to explore and refine the criteria for predicting protein allergenicity.

Traditionally, a queried protein is classified as having allergenic potential if it contains a six-amino-acid sequence that matches one in a known allergen. However, the field is well aware that this criterion may not be stringent enough and can produce many false positives.
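The six-residue rule described above can be sketched in a few lines of Python. This is a minimal illustration of the traditional criterion only, not the AllerCatPro workflow; the function names and the toy sequences are hypothetical and were invented for this example.

```python
def kmers(sequence, k=6):
    """Return the set of all contiguous k-residue windows in a protein sequence."""
    seq = sequence.upper()
    # Sequences shorter than k yield an empty set.
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shares_hexamer(query, allergens, k=6):
    """True if the query shares at least one identical k-mer with any known allergen."""
    query_windows = kmers(query, k)
    return any(query_windows & kmers(allergen, k) for allergen in allergens)


# Toy sequences, invented for illustration only (not real allergens).
toy_allergens = ["MKTLLVAGHWQRSTP", "GGSDEFKLMNPQRW"]

flagged = shares_hexamer("AAAAVAGHWQAAAA", toy_allergens)      # shares "VAGHWQ"
not_flagged = shares_hexamer("AAAAAAAAAAAAAA", toy_allergens)  # no shared window
```

Because a single shared window of six residues is enough to flag a protein, the rule becomes easier to trigger by chance as the reference library grows, which is consistent with the false-positive problem noted above.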

The new method, called AllerCatPro, implements a new criterion from information theory: low-complexity sequences commonly found in many proteins are first ‘filtered out’ of the prediction workflow, reducing random noise in downstream analyses. In addition, AllerCatPro compares the 3D structure of queried protein sequences against those in the aforementioned library to score for molecular shapes associated with allergenicity, thereby moving the previously linear window comparison into the more relevant 3D space.
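To give a feel for what an information-theoretic low-complexity filter can look like, here is a minimal sketch that scores each six-residue window by the Shannon entropy of its amino-acid composition. The article does not specify which measure or threshold AllerCatPro uses, so the entropy measure, the window length, the cut-off of 1.5 bits and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (bits per residue) of the amino-acid composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(sequence, k=6, min_entropy=1.5):
    """Start positions of k-residue windows whose entropy is below min_entropy.

    min_entropy is an assumed, illustrative threshold, not a published value.
    """
    return [
        i
        for i in range(len(sequence) - k + 1)
        if shannon_entropy(sequence[i:i + k]) < min_entropy
    ]


# Toy sequence: a run of alanines followed by a varied stretch.
masked_starts = low_complexity_windows("AAAAAAACDEFGH")
```

Windows dominated by one or two residue types score close to zero bits, while a window of six different residues scores about 2.58 bits. Dropping the low-scoring windows before matching removes repeats that would otherwise match many unrelated proteins.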

“Our new method allows us to determine the allergenicity potential of proteins with a 37-fold increase in specificity, and with 100 percent sensitivity,” said Maurer-Stroh. The workflow of AllerCatPro has been published and applied to identify proteins with low allergenic potential. It is also used by P&G to conduct what is known as weight-of-evidence Type I allergy risk assessments on botanicals or natural ingredients that may contain proteins.
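For readers unfamiliar with the two metrics quoted above, the sketch below states their standard definitions. The counts in the example are made up purely to show the arithmetic and are not results from the AllerCatPro study.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of true allergens that are flagged: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of non-allergens that are correctly cleared: TN / (TN + FP)."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts, for illustration only.
toy_sensitivity = sensitivity(true_positives=50, false_negatives=0)   # 1.0
toy_specificity = specificity(true_negatives=90, false_positives=10)  # 0.9
```

A sensitivity of 100 percent means no known allergen is missed, while higher specificity means fewer harmless proteins are wrongly flagged.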

“Our joint interest is to make the method available widely to facilitate acceptance by the scientific community and regulators, as well as to gather feedback which will help us to continuously improve on the current method,” said Maurer-Stroh of the collaboration. AllerCatPro is also freely accessible on a web server.

Going forward, Maurer-Stroh and his team, together with A*STAR’s Innovations in Food and Chemical Safety (IFCS) program, plan to apply AllerCatPro to the safety assessment of proteins found in novel foods, such as those that replace meat with alternative protein sources. By getting regulatory bodies such as the Singapore Food Agency and companies in the food and nutrition sector on board, the team hopes that AllerCatPro will contribute towards Singapore’s vision of ensuring national food security and safety.

The A*STAR-affiliated researchers contributing to this research are from the Bioinformatics Institute (BII), A*STAR.