Our research interests focus on medical imaging, image and audio processing, digital humanities, interpretable machine learning, and the use of known operators.
Research projects
Image Analysis and Fusion
Learning Algorithms for Medical Big Data Analysis (LAMBDA)
Magnetic Resonance Imaging (MRI)
Speech Processing and Understanding
Development of a guideline for the three-dimensional non-destructive acquisition of manuscripts
Intelligent MR Diagnosis of the Liver by Linking Model and Data-driven Processes (iDELIVER)
Molecular Assessment of Signatures ChAracterizing the Remission of Arthritis
Improved dual energy imaging using machine learning
A multimodal approach for automatic generation of radiology reports using chest X-ray images, clinical free-text, and spoken commands.
Advancements in Artificial Intelligence (AI) methods have enabled the development of Large Language Models (LLMs) capable of generating information from user instructions and supporting various tasks in education, research, healthcare, and others. AI has also impacted the field of medical imaging with several deep learning models capable of achieving expert-level performance across different tasks, e.g., detection, segmentation, and assisted clinical diagnosis. In addition, open-source Automatic Speech Recognition (ASR) systems can be incorporated as modules in AI-based systems. This proposed funded project aims to combine LLMs, medical imaging, and speech recognition using AI methods to generate high-quality radiology reports from chest X-ray images.
Self-Supervised Learning on Chest X-Rays to improve classification and localization
Chest X-Rays (CXR) serve as crucial diagnostic tools for pulmonary and cardiothoracic diseases, generating millions of images daily, a number on the rise due to decreasing acquisition costs. However, there's a pronounced scarcity of radiologists to interpret these images. Traditionally, CXR research has centered on enhancing classification accuracy, often achieving state-of-the-art results. Despite progress, there remain rare and intricate findings challenging for both human radiologists and AI systems to diagnose. Our investigation focuses on leveraging self-supervised image-text models to enhance the classification and localization of diverse findings. These self-supervised models eliminate the need for annotations, enabling the Deep Learning system to effectively learn from extensive public and private datasets.
Bhandary Panambur, A., Yu, H., Bhat, S., Madhu, P., Bayer, S., & Maier, A. (2024). Attention-guided Erasing. In Andreas Maier, Thomas M. Deserno, Heinz Handels, Klaus Maier-Hein, Christoph Palm, Thomas Tolxdorff (Eds.), Bildverarbeitung für die Medizin 2024 (pp. 13-18). Erlangen, DE: Wiesbaden: Springer Vieweg.
Gallo-Aristizábal, J.D., Escobar-Grisales, D., Ríos-Urrego, C.D., Nöth, E., & Orozco-Arroyave, J.R. (2024). Automatic Classification of Parkinson’s Disease Using Wav2vec Embeddings at Phoneme, Syllable, and Word Levels. In Elmar Nöth, Aleš Horák, Petr Sojka (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 313-323). Brno, CZE: Springer Science and Business Media Deutschland GmbH.
Hernandez, A., Perez Toro, P.A., Arias-Vergara, T., Vasquez-Correa, J.C., Yang, S.H., Orozco-Arroyave, J.R., & Maier, A. (2024). Anonymizing Dysarthric Speech: Investigating the Effects of Voice Conversion on Pathological Information Preservation. In Elmar Nöth, Aleš Horák, Petr Sojka (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 149-160). Brno, CZE: Springer Science and Business Media Deutschland GmbH.
Khun Jush, F., Dueppenbecker, P.M., & Maier, A. (2024). Speed-of-Sound Mapping for Pulse-Echo Ultrasound Raw Data Using Linked-Autoencoders. In Andreas K. Maier, Julia A. Schnabel, Pallavi Tiwari, Oliver Stegle (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 103-114). Honolulu, HI, USA: Springer Science and Business Media Deutschland GmbH.
Lopez-Santander, D.A., Rios-Urrego, C.D., Bergler, C., Nöth, E., & Orozco-Arroyave, J.R. (2024). Robust Classification of Parkinson’s Speech: an Approximation to a Scenario With Non-controlled Acoustic Conditions. In Elmar Nöth, Aleš Horák, Petr Sojka (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 252-262). Brno, CZE: Springer Science and Business Media Deutschland GmbH.
Perez Toro, P.A., Dineley, J., Kaczkowska, A., Conde, P., Zhang, Y., Matcham, F.,... Cummins, N. (2024). Longitudinal Modeling of Depression Shifts Using Speech and Language. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (pp. 12021-12025). Seoul, KOR: Institute of Electrical and Electronics Engineers Inc.
Rist, L., Homm, C., Lades, F., Hernandez, A.A., Sühling, M., Gudman Steuble Brandt, E.,... Taubmann, O. (2024). Pancreatic Vessel Landmark Detection in CT Angiography Using Prior Anatomical Knowledge. In Artificial Intelligence in Pancreatic Disease Detection and Diagnosis, and Personalized Incremental Learning in Medicine (pp. 45-54). Marrakesh, MA: Cham: Springer.
Thies, M., Wagner, F., Gu, M., Mei, S., Huang, Y., Pechmann, S.,... Maier, A. (2024). Exploring Epipolar Consistency Conditions for Rigid Motion Compensation in In-vivo X-ray Microscopy. In Andreas Maier, Thomas M. Deserno, Heinz Handels, Klaus Maier-Hein, Christoph Palm, Thomas Tolxdorff (Eds.), Bildverarbeitung für die Medizin 2024. BVM 2024 (pp. 211-216). Erlangen, DE: Wiesbaden: Springer Vieweg.
Arias Vergara, T., Londoño-Mora, E., Perez Toro, P.A., Schuster, M., Nöth, E., Orozco Arroyave, J.R., & Maier, A. (2023). Measuring Phonological Precision in Children with Cleft Lip and Palate. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 4638-4642). Dublin, IE: International Speech Communication Association.
Bachmaier, M., Rohleder, M., Swartman, B., Privalov, M., Maier, A., & Kunze, H. (2023). Robust Hough and Spatial-To-Angular Transform Based Rotation Estimation for Orthopedic X-Ray Images. In Hayit Greenspan, Anant Madabhushi, Parvin Mousavi, Septimiu Salcudean, James Duncan, Tanveer Syeda-Mahmood, Russell Taylor (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 446-455). Vancouver, BC, CAN: Springer Science and Business Media Deutschland GmbH.
Braun, F., Perez Toro, P.A., Bayerl, S.P., Pérez-Toro, P.A., Hönig, F., Lehfeld, H.,... Riedhammer, K. (2023). Classifying Dementia in the Presence of Depression: A Cross-Corpus Study. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 2308-2312). Dublin, IRL: International Speech Communication Association.
Gu, M., Thies, M., Wagner, F., Pechmann, S., Aust, O., Weidner, D.,... Maier, A. (2023). Cavity Segmentation in X-ray Microscopy Scans of Mouse Tibiae. In Thomas M. Deserno, Heinz Handels, Andreas Maier, Klaus Maier-Hein, Christoph Palm, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 254-259). Braunschweig, DEU: Springer Science and Business Media Deutschland GmbH.
Maul, N., Zinn, K., Wagner, F., Thies, M., Rohleder, M., Pfaff, L.,... Maier, A. (2023). Transient Hemodynamics Prediction Using an Efficient Octree-Based Deep Learning Model. In Alejandro Frangi, Marleen de Bruijne, Demian Wassermann, Nassir Navab (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 183-194). San Carlos de Bariloche, AR: Springer Science and Business Media Deutschland GmbH.
Perez Toro, P.A., Arias Vergara, T., Braun, F., Hönig, F., Tobón-Quintero, C.A., Aguillón, D.,... Orozco Arroyave, J.R. (2023). Automatic Assessment of Alzheimer's across Three Languages Using Speech and Language Features. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 1748-1752). Dublin, IE: International Speech Communication Association.
Weise, T., Maier, A., Demir, K., Perez Toro, P.A., Arias Vergara, T., Heismann, B.,... Yang, S.H. (2023). Impact of Including Pathological Speech in Pre-training on Pathology Detection. In Kamil Ekštein, František Pártl, Miloslav Konopík (Eds.), Text, Speech, and Dialogue (pp. 141-153). Pilsen, CZ: Cham: Springer.
Wilm, F., Fragoso, M., Bertram, C.A., Stathonikos, N., Öttl, M., Qiu, J.,... Aubreville, M. (2023). Multi-scanner Canine Cutaneous Squamous Cell Carcinoma Histopathology Dataset. In Thomas M. Deserno, Heinz Handels, Andreas Maier, Klaus Maier-Hein, Christoph Palm, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 206-211). Braunschweig, DE: Springer Science and Business Media Deutschland GmbH.
Wilm, F., Fragoso, M., Marzahl, C., Qiu, J., Puget, C., Diehl, L.,... Aubreville, M. (2023). Abstract: Pan-tumor CAnine CuTaneous Cancer Histology (CATCH) Dataset. In Thomas M. Deserno, Heinz Handels, Andreas Maier, Klaus Maier-Hein, Christoph Palm, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 205-). Braunschweig, DE: Springer Science and Business Media Deutschland GmbH.
Bayerl, S.P., Wagner, D., Nöth, E., Bocklet, T., & Riedhammer, K. (2022). The Influence of Dataset Partitioning on Dysfluency Detection Systems. In Petr Sojka, Aleš Horák, Ivan Kopeček, Karel Pala (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 423-436). Brno, CZ: Springer Science and Business Media Deutschland GmbH.
Bayerl, S.P., Wagner, D., Nöth, E., & Riedhammer, K. (2022). Detecting Dysfluencies in Stuttering Therapy Using wav2vec 2.0. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 2868-2872). Incheon, KR: International Speech Communication Association.
Christlein, V., Marthot-Santaniello, I., Mayr, M., Nicolaou, A., & Seuret, M. (2022). Writer Retrieval and Writer Identification in Greek Papyri. In Springer (Eds.), Intertwining Graphonomics with Human Movements (pp. 76-89). Las Palmas de Gran Canaria, ES.
El-Ghoussani, A., Rodriguez Salas, D., Seuret, M., & Maier, A. (2022). GAN-based Augmentation of Mammograms to Improve Breast Lesion Detection. In Klaus Maier-Hein, Thomas M. Deserno, Heinz Handels, Andreas Maier, Christoph Palm, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 321-326). Heidelberg, DEU: Springer Science and Business Media Deutschland GmbH.
Escobar-Grisales, D., Rios-Urrego, C.D., Gallo-Aristizabal, J.D., Lopez-Santander, D.A., Calvo-Ariza, N.R., Nöth, E., & Orozco-Arroyave, J.R. (2022). Colombian Dialect Recognition from Call-Center Conversations Using Fusion Strategies. In Applied Computer Sciences in Engineering, WEA 2022 (pp. 54-65). Bogota, Colombia: Cham: Springer International Publishing AG.
Gu, M., Vesal, S., Kosti, R.V., & Maier, A. (2022). Few-shot Unsupervised Domain Adaptation for Multi-modal Cardiac Image Segmentation. In Klaus Maier-Hein, Thomas M. Deserno, Heinz Handels, Andreas Maier, Christoph Palm, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 20-25). Heidelberg, DEU: Springer Science and Business Media Deutschland GmbH.
Hernandez, A., Klumpp, P., Das, B.K., Maier, A., & Yang, S.H. (2022). Autoblog 2021: The Importance of Language Models for Spontaneous Lecture Speech. In Petr Sojka, Aleš Horák, Ivan Kopeček, Karel Pala (Eds.), Text, Speech, and Dialogue 25th International Conference, TSD 2022, Brno, Czech Republic, September 6–9, 2022, Proceedings (pp. 291-300). Brno, Czech Republic, CZ: Springer Nature Switzerland AG: Springer Cham.
Klumpp, P., Arias-Vergara, T., Perez-Toro, P.-A., Nöth, E., & Orozco-Arroyave, J.R. (2022). Common Phone: A Multilingual Dataset for Robust Acoustic Modelling. In LREC 2022: Thirteenth International Conference on Language Resources and Evaluation (pp. 763-768). Marseille, France: Paris: European Language Resources Assoc-ELRA.
Mattick, A., Mayr, M., Maier, A., & Christlein, V. (2022). Is Multitask Learning Always Better? In Seiichi Uchida, Elisa Barney, Véronique Eglin (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 674-687). La Rochelle, FR: Springer Science and Business Media Deutschland GmbH.
Mayr, M., Felker, A., Maier, A., & Christlein, V. (2022). Combining Visual and Linguistic Models for a Robust Recipient Line Recognition in Historical Documents. In Seiichi Uchida, Elisa Barney, Véronique Eglin (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 598-612). La Rochelle, FRA: Springer Science and Business Media Deutschland GmbH.
Melsheimer, B., Jahn, A., Putnings, M., Valianos, S., & Walther, M. (2022). Towards a CRIS-integrated solution for University Press workflows. In Proceedings of the CRIS2022: 15th International Conference on Current Research Information Systems. Dubrovnik, HR.
Perez Toro, P.A., Klumpp, P., Hernandez, A., Arias Vergara, T., Lillo, P., Slachevsky, A.,... Orozco Arroyave, J.R. (2022). Alzheimer's Detection from English to Spanish Using Acoustic and Linguistic Embeddings. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 2483-2487). Incheon, KR: International Speech Communication Association.
Pérez-Toro, P.A., Klumpp, P., Vasquez-Correa, J.C., Schuster, M., Nöth, E., Orozco-Arroyave, J.R., & Arias Vergara, T. (2022). 50 Shades of Gray: Effect of the Color Scale for the Assessment of Speech Disorders. In Petr Sojka, Aleš Horák, Ivan Kopeček, Karel Pala (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 352-363). Brno, CZE: Springer Science and Business Media Deutschland GmbH.
Rao, D., Maass, N., Dennerlein, F., Maier, A., & Huang, Y. (2022). Machine Learning-based Detection of Spherical Markers in CT Volumes. In Klaus Maier-Hein, Thomas M. Deserno, Heinz Handels, Andreas Maier, Christoph Palm, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 51-56). Heidelberg, DEU: Springer Science and Business Media Deutschland GmbH.
Rios-Urrego, C.D., Moreno-Acevedo, S.A., Nöth, E., & Orozco Arroyave, J.R. (2022). End-to-End Parkinson’s Disease Detection Using a Deep Convolutional Recurrent Network. In Petr Sojka, Aleš Horák, Ivan Kopeček, Karel Pala (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 326-338). Brno, CZ: Springer Science and Business Media Deutschland GmbH.
Sukesh, R., Seuret, M., Nicolaou, A., Mayr, M., & Christlein, V. (2022). A Fair Evaluation of Various Deep Learning-Based Document Image Binarization Approaches. In Seiichi Uchida, Elisa Barney, Véronique Eglin (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 771-785). La Rochelle, FRA: Springer Science and Business Media Deutschland GmbH.
Wilm, F., Marzahl, C., Breininger, K., & Aubreville, M. (2022). Domain Adversarial RetinaNet as a Reference Algorithm for the MItosis DOmain Generalization Challenge. In Marc Aubreville, David Zimmerer, Mattias Heinrich (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 5-13). Strasbourg, FRA: Springer Science and Business Media Deutschland GmbH.
Zippert, P., Seuret, M., Maier, A., & Hausotte, T. (2022). Influence of X-Ray Radiation on Historical Paper. In Proceedings of the 11th Conference on Industrial Computed Tomography. Wels, Austria, AT.
tom Dieck, T., Perez Toro, P.A., Arias Vergara, T., Nöth, E., & Klumpp, P. (2022). Wav2vec behind the Scenes: How end2end Models learn Phonetics. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 5130-5134). Incheon, KR: International Speech Communication Association.
Bertram, C.A., Donovan, T.A., Tecilla, M., Bartenschlager, F., Fragoso, M., Wilm, F.,... Aubreville, M. (2021). Dataset on Bi- and Multi-nucleated Tumor Cells in Canine Cutaneous Mast Cell Tumors. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 134-139). Regensburg, DE: Springer Science and Business Media Deutschland GmbH.
Chen, S., Stromer, D., Alnasser Alabdalrahim, H., Schwab, S., Weih, M., & Maier, A. (2021). Abstract: Automatic Dementia Screening and Scoring by Applying Deep Learning on Clock-drawing Tests. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 289-). Regensburg, DE: Springer Science and Business Media Deutschland GmbH.
Denzinger, F., Wels, M., Breininger, K., Gülsün, M.A., Schöbinger, M., André, F.,... Maier, A. (2021). Abstract: Automatic CAD-RADS Scoring using Deep Learning. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 104-). Regensburg, DEU: Springer Science and Business Media Deutschland GmbH.
Escobar-Grisales, D., Vasquez Correa, J., & Orozco Arroyave, J.R. (2021). Gender Recognition in Informal and Formal Language Scenarios via Transfer Learning. In Juan Carlos Figueroa-García, Yesid Díaz-Gutierrez, Elvis Eduardo Gaona-García, Alvaro David Orjuela-Cañón (Eds.), Communications in Computer and Information Science (pp. 171-179). Virtual, Online: Springer Science and Business Media Deutschland GmbH.
Felsner, L., Syben-Leisner, C., Maier, A., & Riess, C. (2021). Helical Dark-field Fiber Reconstruction. In Proceedings of the International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine (Fully3D).
Fu, W., Mill, L., Seitz, S., Geimer, T., Kling, L., Possart, D.,... Maier, A. (2021). Towards Mouse Bone X-ray Microscopy Scan Simulation. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 128-133). Regensburg, DE: Springer Science and Business Media Deutschland GmbH.
Hoppe, E., Wetzl, J., Roser, P., Felsner, L., Preuhs, A., & Maier, A. (2021). 2D Respiration Navigation Framework for 3D Continuous Cardiac Magnetic Resonance Imaging. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 158-163). Regensburg, DE: Springer Science and Business Media Deutschland GmbH.
Martín Vicario, C., Kordon, F., Denzinger, F., Weiten, M., Thomas, S., Kausch, L.,... Kunze, H. (2021). Automatic Plane Adjustment in Surgical Cone Beam CT-volumes. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 170-). Regensburg, DE: Springer Science and Business Media Deutschland GmbH.
Marzahl, C., Aubreville, M., Bertram, C.A., Stayt, J., Jasensky, A.K., Bartenschlager, F.,... Maier, A. (2021). Abstract: Deep Learning-based Quantification of Pulmonary Hemosiderophages in Cytology Slides. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 48-). Regensburg, DE: Springer Science and Business Media Deutschland GmbH.
Marzahl, C., Bertram, C.A., Wilm, F., Voigt, J., Barton, A.K., Klopfleisch, R.,... Aubreville, M. (2021). Cell Detection for Asthma on Partially Annotated Whole Slide Images: Learning to be EXACT. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 147-152). Regensburg, DE: Springer Science and Business Media Deutschland GmbH.
Mattick, A., Mayr, M., Seuret, M., Maier, A., & Christlein, V. (2021). SmartPatch: Improving Handwritten Word Imitation with Patch Discriminators. In Josep Lladós, Daniel Lopresti, Seiichi Uchida (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 268-283). Lausanne, CHE: Springer Science and Business Media Deutschland GmbH.
Perez Toro, P.A., Vasquez Correa, J., Arias Vergara, T., Klumpp, P., Sierra-Castrillón, M., Roldán-López, M.E.,... Nöth, E. (2021). Acoustic and Linguistic Analyses to Assess Early-Onset and Genetic Alzheimer's Disease. In Proceedings of the ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 8338-8342). IEEE.
Perez Toro, P.A., Vásquez-Correa, J.C., Arias Vergara, T., Klumpp, P., Schuster, M., Nöth, E., & Orozco-Arroyave, J.R. (2021). Emotional State Modeling for the Assessment of Depression in Parkinson’s Disease. In Kamil Ekštein, František Pártl, Miloslav Konopík (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 457-468). Olomouc, CZE: Springer Science and Business Media Deutschland GmbH.
Reimann, M., Fu, W., & Maier, A. (2021). Novel Evaluation Metrics for Vascular Structure Segmentation. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 80-85). Regensburg, DE: Springer Science and Business Media Deutschland GmbH.
Rios-Urrego, C.D., Vásquez-Correa, J.C., Orozco-Arroyave, J.R., & Nöth, E. (2021). Is There Any Additional Information in a Neural Network Trained for Pathological Speech Classification? In Kamil Ekštein, František Pártl, Miloslav Konopík (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 435-447). Olomouc, CZ: Springer Science and Business Media Deutschland GmbH.
Roser, P., Felsner, L., Maier, A., & Riess, C. (2021). Learning the Inverse Weighted Radon Transform. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 49-54). Regensburg, DE: Springer Science and Business Media Deutschland GmbH.
Roser, P., Zhong, X., Birkhold, A., Preuhs, A., Syben-Leisner, C., Hoppe, E.,... Maier, A. (2021). Abstract: Simultaneous Estimation of X-ray Back-scatter and Forward-scatter using Multi-task Learning. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 262-). Regensburg, DE: Springer Science and Business Media Deutschland GmbH.
Seuret, M., Nicolaou, A., Rodríguez-Salas, D., Weichselbaumer, N., Stutzmann, D., Mayr, M.,... Christlein, V. (2021). ICDAR 2021 Competition on Historical Document Classification. In Josep Lladós, Daniel Lopresti, Seiichi Uchida (Eds.), Document Analysis and Recognition – ICDAR 2021 (pp. 618-634). Lausanne, CH: Springer Science and Business Media Deutschland GmbH.
Theelke, L., Wilm, F., Marzahl, C., Bertram, C.A., Klopfleisch, R., Maier, A.,... Breininger, K. (2021). Iterative Cross-Scanner Registration for Whole Slide Images. In 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2021) (pp. 582-590). Los Alamitos: IEEE Computer Soc.
Tripathi, P., Obler, R., Maier, A., & Janssen, H. (2021). A Novel Trilateral Filter for Digital Subtraction Angiography. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 310-315). Regensburg, DE: Springer Science and Business Media Deutschland GmbH.
Wallraff, S., Vesal, S., Syben-Leisner, C., Lutz, R., & Maier, A. (2021). Age Estimation on Panoramic Dental X-ray Images using Deep Learning. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 186-191). Regensburg, DE: Springer Science and Business Media Deutschland GmbH.
Wilm, F., Bertram, C.A., Marzahl, C., Bartel, A., Donovan, T.A., Assenmacher, C.A.,... Aubreville, M. (2021). Influence of Inter-Annotator Variability on Automatic Mitotic Figure Assessment. In Christoph Palm, Heinz Handels, Klaus Maier-Hein, Thomas M. Deserno, Andreas Maier, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 241-246). Regensburg, DE: Springer Science and Business Media Deutschland GmbH.
Ahmed, J., Vesal, S., Durlak, F., Kaergel, R., Ravikumar, N., Rémy-Jardin, M., & Maier, A. (2020). COPD classification in CT images using a 3D convolutional neural network. In Thomas Tolxdorff, Thomas M. Deserno, Heinz Handels, Andreas Maier, Klaus H. Maier-Hein, Christoph Palm (Eds.), Informatik aktuell (pp. 39-45). Berlin, DE: Springer.
Argüello-Vélez, P., Arias-Vergara, T., González-Rátiva, M.C., Orozco-Arroyave, J.R., Nöth, E., & Schuster, M.E. (2020). Acoustic characteristics of VOT in plosive consonants produced by Parkinson’s patients. In Petr Sojka, Ivan Kopecek, Karel Pala, Aleš Horák (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 303-311). Brno, CZ: Springer Science and Business Media Deutschland GmbH.
Bergler, C., Schmitt, M., Maier, A., Smeele, S., Barth, V., & Nöth, E. (2020). ORCA-CLEAN: A Deep Denoising Toolkit for Killer Whale Communication. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 (pp. 1136-1140). Shanghai, China: International Speech Communication Association.
Bertram, C.A., Veta, M., Marzahl, C., Stathonikos, N., Maier, A., Klopfleisch, R., & Aubreville, M. (2020). Are Pathologist-Defined Labels Reproducible? Comparison of the TUPAC16 Mitotic Figure Dataset with an Alternative Set of Labels. In Jaime Cardoso, Wilson Silva, Ricardo Cruz, Hien Van Nguyen, Badri Roysam, Nicholas Heller, Pedro Henriques Abreu, Jose Pereira Amorim, Ivana Isgum, Vishal Patel, Kevin Zhou, Steve Jiang, Ngan Le, Khoa Luu, Raphael Sznitman, Veronika Cheplygina, Samaneh Abbasi, Diana Mateus, Emanuele Trucco (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 204-213). Lima, PE: Springer Science and Business Media Deutschland GmbH.
Bootwala, A., Breininger, K., Maier, A., & Christlein, V. (2020). Assistive Diagnosis in Opthalmology Using Deep Learning-Based Image Retrieval. In Thomas Tolxdorff, Thomas M. Deserno, Heinz Handels, Andreas Maier, Klaus H. Maier-Hein, Christoph Palm (Eds.), Bildverarbeitung für die Medizin 2020 (pp. 144-149). Wiesbaden: Springer Vieweg.
Breininger, K., Pfister, M., Kowarschik, M., & Maier, A. (2020). Move Over There: One-Click Deformation Correction for Image Fusion During Endovascular Aortic Repair. In Anne L. Martel, Purang Abolmaesumi, Danail Stoyanov, Diana Mateus, Maria A. Zuluaga, S. Kevin Zhou, Daniel Racoceanu, Leo Joskowicz (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 713-723). Lima, PE: Springer Science and Business Media Deutschland GmbH.
Felsner, L., Würfl, T., Syben-Leisner, C., Roser, P., Preuhs, A., Maier, A., & Riess, C. (2020). Reconstruction of Voxels with Position- and Angle-Dependent Weightings. In Proceedings of the 6th International Conference on Image Formation in X-Ray Computed Tomography. Online meeting.
Hoßbach, J., Husvogt, L., Kraus, M., Fujimoto, J.G., & Maier, A. (2020). Deep OCT angiography image generation for motion artifact suppression. In Thomas Tolxdorff, Thomas M. Deserno, Heinz Handels, Andreas Maier, Klaus H. Maier-Hein, Christoph Palm (Eds.), Informatik aktuell (pp. 248-253). Berlin, DE: Springer.
Huang, Y., Gao, L., Preuhs, A., & Maier, A. (2020). Field of View Extension in Computed Tomography Using Deep Learning Prior. In Andreas Maier, Klaus Hermann Maier-Hein, Thomas Martin Deserno, Heinz Handels, Thomas Tolxdorff (Eds.), Bildverarbeitung für die Medizin: Algorithmen – Systeme – Anwendungen. Berlin, DE: Springer.
Klumpp, P., Arias Vergara, T., Vasquez Correa, J., Perez Toro, P.A., Hönig, F.T., Nöth, E., & Orozco-Arroyave, J.-R. (2020). Surgical mask detection with deep recurrent phonetic models. In Proceedings of the 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 (pp. 2057-2061). International Speech Communication Association.
Kordon, F., Fischer, P., Privalov, M., Swartman, B., Schnetzke, M., Franke, J.,... Kunze, H. (2020, February). Multi-Task Framework for X-Ray Guided Planning in Knee Surgery. Poster presentation at Bildverarbeitung für die Medizin 2020, Berlin, DE.
Marzahl, C., Bertram, C.A., Aubreville, M., Petrick, A., Weiler, K., Gläsel, A.C.,... Maier, A. (2020). Are fast labeling methods reliable? A case study of computer-aided expert annotations on microscopy slides. In Anne L. Martel, Purang Abolmaesumi, Danail Stoyanov, Diana Mateus, Maria A. Zuluaga, S. Kevin Zhou, Daniel Racoceanu, Leo Joskowicz (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 24-32). Lima, PE: Springer Science and Business Media Deutschland GmbH.
Mayr, M., Stumpf, M., Nicolaou, A., Seuret, M., Maier, A., & Christlein, V. (2020). Spatio-Temporal Handwriting Imitation. In Springer, Cham (Eds.), Proceedings of the European Conference on Computer Vision (pp. 528-543). Online.
Preuhs, A., Manhart, M., Roser, P., Stimpel, B., Syben-Leisner, C., Psychogios, M.,... Maier, A. (2020). Deep autofocus with cone-beam CT consistency constraint. In Proceedings of the Bildverarbeitung für die Medizin. Berlin, DE.
Restrepo-Uribe, J.P., Roldan-Vasco, S., Perez-Giraldo, E., Orozco Arroyave, J.R., & Orozco-Duque, A. (2020). Electrophysiological and Mechanical Approaches to the Swallowing Analysis. In Juan Carlos Figueroa-García, Fabián Steven Garay-Rairán, Germán Jairo Hernández-Pérez, Yesid Díaz-Gutierrez (Eds.), Communications in Computer and Information Science (pp. 281-290). Bogota, CO: Springer Science and Business Media Deutschland GmbH.
Reymann, M., Massanes, F., Ritt, P., Cachovan, M., Kuwert, T., Vija, A.H., & Maier, A. (2020, November). Feature Loss After Denoising of SPECT Projection Data using a U-Net. Poster presentation at 2020 IEEE Nuclear Science Symposium and Medical Imaging Conference, Boston, Massachusetts, US.
Roser, P., Zhong, X., Birkhold, A., Preuhs, A., Syben-Leisner, C., Hoppe, E.,... Maier, A. (2020). Simultaneous Estimation of X-Ray Back-Scatter and Forward-Scatter Using Multi-task Learning. In Anne L. Martel, Purang Abolmaesumi, Danail Stoyanov, Diana Mateus, Maria A. Zuluaga, S. Kevin Zhou, Daniel Racoceanu, Leo Joskowicz (Eds.), Medical Image Computing and Computer Assisted Intervention – MICCAI 2020 (pp. 199-208).
Schaffert, R., Wang, J., Fischer, P., Borsdorf, A., & Maier, A. (2020). Learning-based misalignment detection for 2-D/3-D overlays. In Thomas Tolxdorff, Thomas M. Deserno, Heinz Handels, Andreas Maier, Klaus H. Maier-Hein, Christoph Palm (Eds.), Informatik aktuell (pp. 230-235). Berlin, DE: Springer.
Schaffert, R., Weiß, M., Wang, J., Borsdorf, A., & Maier, A. (2020). Learning-based correspondence estimation for 2-D/3-D registration. In Thomas Tolxdorff, Thomas M. Deserno, Heinz Handels, Andreas Maier, Klaus H. Maier-Hein, Christoph Palm (Eds.), Informatik aktuell (pp. 222-228). Berlin, DEU: Springer.
Schirrmacher, F., Lorch, B., Stimpel, B., Köhler, T., & Riess, C. (2020). SR²: Super-Resolution With Structure-Aware Reconstruction. In Proceedings of the 2020 IEEE International Conference on Image Processing (ICIP) (pp. 533-537). Online.
Schuller, B., Batliner, A., Bergler, C., Messner, E.-M., Hamilton, A., Amiriparian, S.,... Hantke, S. (2020). The INTERSPEECH 2020 Computational Paralinguistics Challenge: Elderly Emotion, Breathing & Masks. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 (pp. 2042-2046). Shanghai, China: International Speech Communication Association.
Research projects
A multimodal approach for automatic generation of radiology reports using chest X-ray images, clinical free-text, and spoken commands.
(FAU Funds)
Self-Supervised Learning on Chest X-Rays to improve classification and localization
(Non-FAU Project)