Johns Hopkins University
The main goal of the CCVL (Computational Cognition, Vision, and Learning) research group is to develop mathematical models of vision and cognition. These models are intended primarily for designing artificial (computer) vision systems. Learning is required for extracting knowledge from data. Practical applications include vision for the disabled. These models also serve as computational models of biological vision, which can be tested by behavioral methods and, in collaborative projects, with invasive and non-invasive neuroscience techniques. We also study how humans and animals perform cognitive tasks such as learning and reasoning. In addition, we use machine learning to interpret medical images and study brain function.
Advisor Profile
Professor Alan Yuille is a Bloomberg Distinguished Professor in the Department of Computer Science and the Department of Cognitive Science at Johns Hopkins. He has published many influential papers in computer vision and cognitive science. He has won the ICCV Marr Award and is an IEEE Fellow.
Lab Page:
Overall Information
We are seeking several summer research interns for 2024. The internship starts in May, and the duration is flexible (between 6 months and 1 year). Exceptional interns from previous years have published first-author papers at top conferences in computer vision or medical image processing, such as CVPR, ICLR, and MICCAI. Exceptional interns will be given priority in Ph.D. applications.
Research Directions
Our lab’s research lies in computer vision and machine learning. The research groups are: 3D generative models, 3D datasets, medical image analysis, Transformers, vision and language, and embodied AI (mentored by Prof. Tianmin Shu).
Applicants are expected to meet the requirements of one of the following groups. We would also appreciate it if you could specify which group you are interested in when submitting your application. We strongly encourage you to read our group’s related papers and pick up some preliminary knowledge by checking our publication list:
The requirements for different groups are as follows:
1. 3D generative models:
Basic: basic usage of PyTorch;
Basic: understanding of 3D imaging (camera systems);
Preferred: Publications (can be under review) on related topics, e.g., NeRF, 3D reconstruction, pose estimation, 3D detection, etc.
Understanding recent 3D vision or reconstruction techniques.
At least one of the following topics:
  • 3D from images (pose and shape, 3D detection)
  • Differentiable rendering (e.g., PyTorch3D, Gaussian Splatting)
  • Other 3D-related topics
2. 3D datasets:
Basic skills in using Python, PyTorch, and other machine-learning libraries;
Basic skills in using 3D tools, e.g., Blender.
3. Medical image analysis:
Proficiency in computer vision and image analysis concepts;
Proficiency in Python programming to use prevalent frameworks (such as nnU-Net and MONAI);
Prior experience with the analysis of radiological image datasets for AI applications is preferred;
Relevant publications or submissions in conferences or journals (such as MICCAI, TMI, and MedIA) are preferred.
4. Transformers:
Basic skills in using Python, PyTorch, and other machine-learning libraries;
Basic mathematics foundations in related areas, e.g., statistical learning and optimization;
Knowledge of the basic concepts of Transformer architectures;
Publications or submissions in related conferences and journals, e.g., CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, TPAMI, IJCV, JMLR.
5. Vision and language:
Proficiency in using Python, PyTorch, and other machine-learning libraries;
Basic knowledge of common deep learning methods in image understanding, language modeling, and multimodal learning (e.g., CNN, LSTM, Transformer);
Understanding of the concepts of generative learning and the attention mechanism in Transformers;
Hands-on experience with vision-language models or large language models (e.g., CLIP, GPT, LLaMA, BLIP, Flamingo, Stable Diffusion…);
Publications or submissions in related conferences or journals, e.g., CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, TPAMI, IJCV, TMLR.
6. Embodied AI (mentored by Prof. Tianmin Shu):
Basic skills in using Python, PyTorch, and other machine-learning libraries;
Experience or interest in the following topics:
  • Generative AI for developing embodied simulators with diverse and realistic human behaviors, including but not limited to synthesizing human-object interactions in household environments, human-vehicle interactions, and physically grounded social interactions.
  • Multimodal theory of mind reasoning for embodied agents.
  • Embodied human-AI cooperation and communication.
How to Apply
If interested, please email Professor Alan Yuille ([email protected]) with your resume attached. Interns will collaborate with Professor Alan Yuille, Professor Tianmin Shu, and their research teams at Johns Hopkins.

