Yury Kostyukevich obtained his Ph.D. degree in Chemical Physics, studying at the Moscow Institute of Physics and Technology and defending his thesis at the N.N. Semenov Institute of Chemical Physics. Yury joined Skoltech in 2014. His research interests include high-resolution mass spectrometry, analysis of complex natural mixtures, proteomics, metabolomics, gas-phase ion chemistry, instrumentation development, and supercomputer simulation of ion optics.
Yury was promoted to the position of Assistant Professor in 2019. He currently leads a team developing a technological platform for drug discovery. The research is supported by a grant of 5,000,000 RUB/year from the Russian Science Foundation. This data-intensive research includes developing novel approaches (both experimental and computational) to measuring molecular descriptors for all compounds simultaneously, as well as data processing and database search. We have shown that our approach allows up to a tenfold increase in the reliability of drug identification.
Yury has published more than 100 papers and holds 4 patents. He is the principal investigator of several grants supported by the Russian Foundation for Basic Research and the Russian Science Foundation. His h-index is 23.
Screening for toxins, drugs, and poisonous compounds is extremely important in many areas, including homeland security, forensic science, and the food industry. Mass spectrometry coupled with liquid or gas chromatography is the main tool for such studies. However, an important limitation of the classical approach is the insufficient size of the reference databases of LC-MS/MS fingerprints for standard compounds. The project is dedicated to using artificial intelligence to predict physical and chemical properties of chemical compounds that can be used for identification.
Surgical resection, along with subsequent chemotherapy and radiation, is the primary approach to cancer treatment. The correct choice of the tumor resection margin has significant implications for patients undergoing surgical resection: the choice between preserving enough healthy tissue and removing all tumor cells is always made intraoperatively by a specially trained histologist performing light microscopy analysis of frozen tissue sections.
Despite its widespread use, histological imaging has some significant disadvantages: it is subjective and depends directly on the histologist's experience, it takes a long time, which prolongs the patient's anesthesia, and it is limited to a certain number of available sampling points. For these reasons, the procedure might be unreliable for up to 30% of patients who undergo surgical resection.
Mass spectrometry imaging (MSI) makes it possible to directly determine the spatial distribution of ions at high resolution by measuring the number of ions with a specific mass-to-charge ratio (m/z) for every microscopic area of the sample. In comparison to light microscopy analysis, the MSI approach is not limited to the 2-3 dyes available to clinical histologists and can provide rich data on thousands of measured sample ions. Such data contain complete information about the ion distributions in the studied tissue samples and lend themselves to analysis with machine learning methods, opening a new way of defining the tumor resection margin.
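The idea of an ion image, that is, the intensity of one m/z value at every point of the sample, can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the data layout (a dictionary from pixel coordinates to a peak list), the function name `ion_image`, and all numbers are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical data layout: each pixel (row, col) maps to a list of
# (m/z, intensity) peaks measured at that spot of the tissue section.
Pixel = Tuple[int, int]
Spectrum = List[Tuple[float, float]]


def ion_image(pixels: Dict[Pixel, Spectrum], target_mz: float, tol: float,
              n_rows: int, n_cols: int) -> List[List[float]]:
    """Sum the intensity found within target_mz +/- tol for every pixel."""
    image = [[0.0] * n_cols for _ in range(n_rows)]
    for (row, col), spectrum in pixels.items():
        image[row][col] = sum(
            (intensity for mz, intensity in spectrum
             if abs(mz - target_mz) <= tol),
            0.0,
        )
    return image


# Toy 2x2 "tissue section" with made-up peaks.
toy_sample = {
    (0, 0): [(885.55, 120.0), (700.10, 5.0)],
    (0, 1): [(885.54, 80.0)],
    (1, 0): [(700.10, 40.0)],
    (1, 1): [],
}
image = ion_image(toy_sample, target_mz=885.55, tol=0.02, n_rows=2, n_cols=2)
```

Repeating this for thousands of m/z values yields a stack of images, one channel per ion, which is the kind of input a machine learning model for margin prediction could be trained on.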
This project proposes a novel scheme for intraoperative prediction of the tumor resection border. It relies on a machine learning-based computational tool, currently under development, that predicts the resection margin from tissue sections and the mass spectrometry imaging data obtained from them.
Fig. 1. Possible intraoperative procedure that includes the proposed MSI step
After the sample is received, it is proposed to place it in a mass spectrometer of sufficient resolution, whose data make it possible to identify the affected tissue cells with adequate reliability. A purpose-built deep learning model can then use these data to predict the optimal boundary along which the surgeon will remove the tumor. Such technology will be a reliable aid for histologists, providing them with an independent second opinion, and will also increase the effectiveness of treatment and possibly speed up the overall process.
In more detail, in collaboration with the Burdenko Research Institute, we will construct new datasets consisting of mass spectrometry imaging data of tumor-affected tissues and the related histological ground-truth references. We will use them for hypothesis testing and for developing deep learning architectures and training algorithms that provide an efficient and reliable way of predicting the tumor resection margin.
Existing solutions for intraoperative tumor margin prediction that are based on MS technologies, such as the Smart Knife (iKnife), desorption electrospray ionization (DESI), the Picosecond Infrared Laser (PIRL), and the MasSpec Pen, differ in invasiveness, speed, spatial resolution, and user skill requirements. However, all of them are still at the research/prototype and clinical testing stages, and the problem remains open and relevant.
An OMICs study can be defined as a study of the totality of something. Proteomics, lipidomics, and metabolomics operate with big molecular data that are typically collected by mass spectrometry analysis. Molecular data are organized as a table that contains molecular features, such as m/z (mass-to-charge ratio), retention times, and m/z values for fragments of the molecule. Correct interpretation of these data requires unequivocal identification of molecular structures from the mass spectrometry data. The most common approach is to perform a database search on these features. However, experimental database information is lacking: for example, the MZCloud database of fragment mass spectra covers about 10⁵ molecules, while the PubChem database of all synthesized molecules contains more than 10⁷, and the whole chemical space has been estimated at 10⁶⁰. That is why computational approaches are needed to enhance the identification process.
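A database search on a single feature, the measured m/z, can be sketched as below. This is a simplified illustration under stated assumptions: the `Entry` type, the `search` function, the 5 ppm tolerance, and the three database entries with their m/z values are hypothetical and do not come from any real database.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    name: str
    mz: float  # reference m/z of the ion


def ppm_error(measured: float, reference: float) -> float:
    """Relative mass error in parts per million."""
    return (measured - reference) / reference * 1e6


def search(database: List[Entry], measured_mz: float,
           tol_ppm: float = 5.0) -> List[Entry]:
    """Return entries within tol_ppm of the measured m/z, best match first."""
    hits = [e for e in database
            if abs(ppm_error(measured_mz, e.mz)) <= tol_ppm]
    return sorted(hits, key=lambda e: abs(ppm_error(measured_mz, e.mz)))


# Made-up reference table: compound_A and compound_B are isomers,
# so they share the same m/z value.
database = [
    Entry("compound_A", 195.0877),
    Entry("compound_B", 195.0877),
    Entry("compound_C", 195.1016),
]
hits = search(database, measured_mz=195.0880)
hit_names = [e.name for e in hits]
```

In this toy case the search returns both isomers, which shows why m/z alone is not enough and why additional descriptors (retention times, fragment spectra, isotope exchange data) are needed to discriminate between candidates.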
The ultimate goal of the project is to produce both an experimental and a data processing workflow that allows all molecular species in complex biological samples to be discriminated in mass spectrometry-based OMICs studies.
The experimental part involves adding new features that can be collected using mass spectrometry. In particular, we will implement a previously developed approach based on isotope exchange reactions for the identification of illicit drugs and environmental pollutants. For that purpose, we will establish the experimental conditions and design an ion source for on-the-fly isotope labelling.
The computational part will include the development of a data processing approach for isotope exchange experiments that assigns mass spectral results to specific parts of a molecule.
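As a simplified illustration of how isotope exchange adds a new feature, the number of exchanged hydrogens can be estimated from the mass shift between the unlabelled and the H/D-exchanged ion and then used to filter candidate structures. The sketch below is an assumption-laden example rather than the group's actual processing: it ignores the isotopic distribution and whether the charge carrier itself is a deuteron, and the function name, m/z values, and candidate list are hypothetical.

```python
# Mass difference between deuterium and protium, in Da.
DEUTERIUM_SHIFT = 2.014101778 - 1.007825032


def exchanged_hydrogens(mz_unlabelled: float, mz_labelled: float,
                        charge: int = 1) -> int:
    """Estimate how many H atoms were replaced by D from the m/z shift."""
    mass_shift = (mz_labelled - mz_unlabelled) * abs(charge)
    return round(mass_shift / DEUTERIUM_SHIFT)


# Made-up measurement: the ion moves from m/z 200.1000 to m/z 203.1188.
n_exchanged = exchanged_hydrogens(200.1000, 203.1188)

# Hypothetical candidates with their number of labile hydrogens
# (for example in -OH, -NH, -COOH groups).
candidates = {"candidate_1": 1, "candidate_2": 3, "candidate_3": 5}
consistent = [name for name, labile_h in candidates.items()
              if labile_h == n_exchanged]
```

Here only the candidate with three labile hydrogens remains, so the exchange count acts as an extra molecular descriptor alongside m/z.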
We will also use common features (retention times, fragment spectra) for identification. Deep learning will be used to predict these features from molecular structures and to assign fragmentation data to molecular structures.
Figure 1. A – General idea of the transfer learning approach. B – Illustration of the data augmentation method. C – Illustration of self-supervised learning on SMILES strings.
Many natural systems, such as petroleum, humic substances, dissolved organic matter, and archeological and paleontological objects, are ultracomplex mixtures that have undergone a long period of transformation in the environment. Such samples consist of more than 100,000 individual molecules, which makes it essential to use modern Big Data processing tools to investigate them.
Students will work with ultrahigh-resolution mass spectra of natural samples in order to develop tools for the automatic comparison and classification of samples. One of the goals of the project is to understand the nature and composition of the resins used by ancient Egyptians for mummification.
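One simple way to compare two samples automatically is to bin their peak lists on a common m/z grid and compute the cosine similarity of the resulting intensity vectors. The sketch below is only an illustration of that idea; the bin width, function names, and peak values are assumptions, and real tools for ultracomplex mixtures would typically work on assigned molecular formulas instead.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Peaks = List[Tuple[float, float]]  # (m/z, intensity) pairs


def bin_spectrum(peaks: Peaks, bin_width: float = 0.001) -> Dict[int, float]:
    """Accumulate intensities on a fixed m/z grid."""
    binned: Dict[int, float] = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def cosine_similarity(a: Dict[int, float], b: Dict[int, float]) -> float:
    """Cosine of the angle between two binned spectra (0 = no shared peaks)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up peak lists for three samples.
sample_1 = bin_spectrum([(301.1410, 10.0), (315.1570, 4.0)])
sample_2 = bin_spectrum([(301.1410, 20.0), (315.1570, 8.0)])
sample_3 = bin_spectrum([(412.2000, 7.0)])

same_profile = cosine_similarity(sample_1, sample_2)
no_overlap = cosine_similarity(sample_1, sample_3)
```

Because the measure ignores overall intensity scale, the first two samples (same peaks, doubled intensities) come out as essentially identical, while samples with no shared peaks score zero. A matrix of such pairwise scores can serve as input for clustering or classification.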
Y Kostyukevich, S Solovyov, A Kononikhin, I Popov, E Nikolaev. The investigation of the bitumen from ancient Greek amphora using FT ICR MS, H/D exchange and novel spectrum reduction approach. Journal of Mass Spectrometry 51 (6), 430-436.
The full list of publications is available here:
High resolution mass-spectrometry, metabolomics, imaging, natural compounds, gas phase ion chemistry, machine learning, structural proteomics, analysis of complex biochemical mixtures, supercomputer modeling of ion cloud behavior in accumulation and transportation of ions, novel approaches for structure elucidation, mass spectrometry in archeology and paleontology. Research is supported by RFBR and RSF grants.