
PROJECT FEED

ALL projects are actively recruiting!

Spoken Language Processing Projects

Description:

Two projects involving medical issues: one on identifying, from the text and speech of a patient's subsequent discussions with practitioners, whether the patient needs to be readmitted to the hospital; and one on multiple types of speech and text from speakers with different medical conditions (e.g., schizophrenia, anxiety, PTSD, and many more).

Topic(s):

Computer Science, Data Science & Math, Medicine/Clinical Research

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

Creative Machines Lab Projects

Description:

A host of ML projects:

Smart Buildings: Work on applying RL to optimize the HVAC systems of commercial buildings, in partnership with Google. See: https://arxiv.org/pdf/2410.03756

Meta Learning: Develop algorithms that “learn how to learn”, where various parts of the learning process are themselves learned in a bi-level optimization.

AI for Biometrics: Develop AI algorithms for the next generation of biometric analysis. See: https://www.science.org/doi/pdf/10.1126/sciadv.adi0329

AutoURDF: Unsupervised Robot Modeling from Point Cloud Videos. AutoURDF is a pipeline that automatically generates URDF (Unified Robot Description Format) files from time-series 3D point cloud data. It segments moving parts, infers the robot’s kinematic topology, and estimates joint parameters, without any ground-truth annotations or manual intervention. This makes AutoURDF a scalable and fully visual solution for automated robot modeling. The first paper can be found here: https://openaccess.thecvf.com/content/CVPR2025/papers/Lin_AutoURDF_Unsupervised_Robot_Modeling_from_Point_Cloud_Frames_Using_Cluster_CVPR_2025_paper.pdf

Generative Kinematics Synthesis: Image-based Kinematics Synthesis with Generative Models. This project develops an image-based representation of the Planar Linkages dataset, covering a range of mechanisms from simple four-bar linkages to complex structures like the Jansen mechanism. Compared to previous studies, this dataset further includes crank-slider mechanisms, enabling broader exploration. A joint latent-space VAE model is used to investigate the potential of image generative models for simulating unseen kinematics and synthesizing novel motion trajectories. The same architecture also supports kinematic synthesis conditioned on both trajectory shape and velocity profiles. Preliminary results highlight the flexibility of image-based representations in generative mechanical design, showing that linkages, revolute and prismatic joints, and, in future work, cams and gears can be unified under the same pixel-based encoding.

Knolling: Design a robot that can look at a cluttered pile of Lego bricks and sort them neatly by color, shape, and size. Use end-to-end ML.

Foosball: Use reinforcement learning or other methods to train a robotic system to play on a foosball table. Start in simulation and continue in reality (physical system in development).

2D Shape Vectorization: Go from a raster 2D shape (e.g., a polygon) to a CSG tree of primitives (e.g., logic operations on simple shapes). If successful, apply in 3D.

Supervisory ML: Explore whether a neural network (NN) can learn to determine whether another, observed neural network is confident in its answers, simply by looking at some of the observed NN's internal states.

Self-replicating NN: See if a NN can learn to output the values of its own weights (all of them) and at the same time learn to perform some other task.

Topic(s):

Computer Science, Data Science & Math, Engineering

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

Clinical Research Informatics

Description:

We are an NIH-funded clinical research informatics lab with a focus on developing and evaluating innovative informatics interventions to support the life cycle of clinical research. Our research methods include knowledge representation, knowledge engineering, and knowledge-guided and data-centric Augmented Intelligence (AI).

Topic(s):

Computer Science, Data Science & Math, Natural & Physical Science

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

History Lab and LLMs

Description:

History Lab has more than 4 million declassified government documents. We are looking for two students to work on two separate but related projects involving LLMs. The first involves fine-tuning an LLM on our documents so that users can ask questions about our data and receive answers drawn from it. The second involves training a language model to correct OCR errors in our documents. Students need advanced Python skills and experience working with LLMs.

Topic(s):

Computer Science, Data Science & Math, Political/Legal, Humanities

Remote

Organization

Columbia University

Published Date

Sep 9, 2025

Machine Learning Downscaling of Earth System Models

Description:

Program Overview: We are developing an innovative program that generates high-resolution fire projections derived from Earth system model simulations through advanced machine learning downscaling techniques.

Student Responsibilities: The student will collaborate with Dr. Erfani, a machine learning expert, to develop deep learning architectures for Earth system data downscaling using generative AI approaches such as denoising diffusion probabilistic models. A key advantage of diffusion models lies in their probabilistic nature, which enables uncertainty quantification and generation of multiple plausible scenarios rather than single deterministic outcomes. In our case, this technique offers the computational efficiency necessary to capture fine-scale spatial heterogeneity across large ensembles of plausible climate trajectories.

Topic(s):

Computer Science, Data Science & Math

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

Projects in Artificial Intelligence (AI)

Description:

Project Options

AI for Cancer Detection: Identifying Cancer Cells and Their Biomarker Expressions. Cell quantitation techniques are used in biomedical research to diagnose and treat cancer. Current quantitation methods are subjective and based mostly on visual impressions of stained tissue samples. This time-consuming process causes delays in therapy that reduce the effectiveness of treatments and add to patient distress. Our lab is developing computational algorithms that use deep learning to model changes in protein structure from multispectral observations of tissue. Once computed, the model can be applied to any tissue observation to detect a variety of protein markers without further spectral analysis. The deep learning model will be quantitatively evaluated on a learning dataset of cancer tumors.

AI for Neuroscience: Deep Learning for Diagnosing and Treating Neurological Disorders. Advances in biomedical research are based upon two foundations: preclinical studies using animal models, and clinical trials with human subjects. However, translation from basic animal research to treatment of human conditions is not straightforward. Preclinical studies in animals may not replicate across labs, and a multitude of preclinical leads have failed in human clinical trials. Inspired by recent generative models for semi-supervised action recognition and probabilistic 3D human motion prediction, we are developing a system that learns animal behavior from unstructured video frames without labels or annotations. Our approach extends a generative model to incorporate adversarial inference and transformer-based self-attention modules.

AI for Multimodal Data & Document Analysis: Deciphering Findings from the Tulsa Race Massacre Death Investigation. The Tulsa Race Massacre (1921) destroyed a flourishing Black community and left up to 300 people dead. More than 1,000 homes were burned and destroyed. Efforts are underway to locate the bodies of victims and reconstruct lost historical information for their families. Collaborating with the Tulsa forensics team, we are developing on-site spectral imaging methods for deciphering information on eroded materials (stone engravings, rusted metal, and deteriorated wood markings), and a novel multimodal transformer network to associate recovered information on gravestones with death certificates and geographical information from public records.

AI for Quantum Physics & Appearance Modeling: Quantum Level Optical Interactions in Complex Materials. The wavelength dependence of fluorescence is used in the physical sciences for material analysis and identification. However, fluorescent measurement techniques like mass spectrometry are expensive and often destructive. Empirical measurement systems effectively simulate material appearance but are time-consuming, requiring densely sampled measurements. Leveraging GPU processing and shared supercomputing resources, we develop deep learning models that incorporate principles from quantum mechanics theory to solve large-scale many-body problems in physics for non-invasive identification of complex proteinaceous materials.

Topic(s):

Computer Science, Data Science & Math, Natural & Physical Science

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

Biology of Aging

Description:

Students would receive training on the biology of aging as well as experimental biology techniques that are relevant to the project. Students would be responsible for contributing to these projects and coordinating activities and data analysis with other members of the laboratory.

Topic(s):

Data Science & Math, Natural & Physical Science

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

Design of 3D/2D UI, AR/VR, Mobile and Wearable Systems

Description:

The Computer Graphics and User Interfaces Lab (Prof. Feiner, PI) does research in the design of 3D and 2D user interfaces, including augmented reality and virtual reality, and mobile and wearable systems, for people interacting individually and together, indoors and outdoors. We use a range of displays and devices: head-worn, hand-held, and table-top, including Varjo XR-3, HP Reverb G2 Omnicept, Meta Quest Pro/3, Valve Index, HoloLens 2, Magic Leap 2, Xreal Light, Snap Next-Generation Spectacles, 3D Systems Touch, and phones. Multidisciplinary projects potentially involve working with faculty and students in other schools and departments, from medicine and dentistry to earth and environmental sciences and social work.

Topic(s):

Computer Science, Engineering

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

New York Genome Center: Forensic Genetics, Wet Lab Researcher

Description:

This wet lab research opportunity is part of innovative forensic genetics projects at the New York Genome Center. The projects are genetics-for-human-rights initiatives that investigate the potential for biological encryption and the permeation of DNA into soil at gravesites. The outcomes of this work will be relevant to medical genomics, forensic science, archaeology, and global humanitarian efforts. The position entails working in the wet lab, performing enzymatic assays, quantitative PCR, DNA extraction, and genomic library preparation.

Topic(s):

Natural & Physical Science

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

ML, Kalman Filtering, Transfer Learning on Smaller Datasets

Description:

The student(s) will be refining an existing ML model, implementing Kalman filtering, and trying out transfer learning on smaller datasets.

Topic(s):

Computer Science, Data Science & Math

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

Research Assistant in Neurology

Description:

This volunteer research assistant program offers students a unique opportunity to engage in neuroscience research focused on understanding cerebellar circuitry and its role in movement disorders. Students will work closely with the research team to acquire and analyze human physiological data, contributing to studies aimed at advancing knowledge of motor and cognitive functions.

Student Responsibilities:
- Assist with the acquisition of human physiological data (electroencephalography, EEG) during research sessions
- Perform data preprocessing, organization, and analysis using relevant software tools
- Support experimental setup and ensure proper functioning of equipment
- Collaborate with team members to interpret data and contribute to research discussions
- Maintain detailed and accurate records of experiments and results
- Participate in regular team meetings and contribute ideas for improving research protocols

Topic(s):

Natural & Physical Science, Medicine/Clinical Research

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

Database-Backed Web Interfaces for Data Interaction

Description:

Build database-backed web interfaces for data interaction and publish them as an open-source repository. An example of what this looks like: https://columbia-desdr.github.io/desdr-ethiopia-demo/config

Topic(s):

Computer Science, Engineering

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

Safety and Understanding of LLMs

Description:

We will work on the safety and understanding of large language models (LLMs). We are interested in LLM-based agents, including applications to embodied AI and task solving.

Topic(s):

Computer Science, Data Science & Math

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

Logic of Tissue Quality Control: Embryos as an Experimental Paradigm

Description:

How do our cells keep time – or pause it – to shape their physiology? The Aydogan Lab investigates the fundamental principles of biological time control in animal development, metabolism, and disease. We are particularly interested in understanding the utility of time control in tissue homeostasis using animal embryos as a model system. As such, this exciting opportunity involves working alongside expert scientists at the interface of embryonic biology, metabolism, and biological time control, using techniques ranging from cell and molecular biology to bioinformatics, from advanced super-resolution microscopy to machine learning approaches.

When tissues detect defective cells, they face a dilemma: whether to help repair them (cell homeostasis) or to eliminate them (cell death). The same quandary occurs at sub-cellular levels too: when cells sense a defect within, they face the difficult choice of either resolving the defect or sacrificing themselves altogether. Decades of work helped reveal cellular mechanisms of homeostasis (e.g., ISR, UPR, DDR, and so on) and death (apoptosis, ferroptosis, cell extrusion, and so on), yet the molecular justification and decision-making mechanisms that help choose between these two functionally conflicting phenomena are only slowly being unearthed. Conflicting indeed: while the former choice helps a cell survive, the latter benefits the tissue at large by containing the defect locally to avoid its spread. On rarer occasions, even in the face of extreme stressors and damage, cells don’t display hallmarks of death and continue to live on (e.g., polyploidy or suspended animation). How could cells decide between these vital choices?

We rationalize that, at least in part, time is of the essence. When cellular damage arises, how long could the cell or tissue persist by repairing it? How does a cell decide to slow, speed up, or arrest itself to resolve the damage? Does this choice differ depending on when in the cell’s or tissue’s life cycle the damage occurs? For instance, does it matter that the damage occurs in rapidly dividing cells in a developing tissue versus in post-mitotic cells in an adult organ? Does it make a difference that the same type of damage occurs at different phases of the cell’s division cycle? When and how does a cell or tissue switch from repairing the damage to committing to death? Does the age of the cell or tissue make a difference in the choice of homeostasis versus death? Or in how quickly to switch from the former to the latter? All these questions boil down to decision-making mechanisms that rely on measuring the duration of damage or on monitoring the timing and amplitude of its consequences.

As such, we conjecture that unravelling the molecular mechanisms and logic of time control in tissue quality maintenance promises next-generation therapeutic approaches to treating tissue malformations, damage, and disorders. In the Aydogan Lab, we use embryonic quality control as a model paradigm to address these questions.

Your role will be to assist with an ongoing project in this program of the Aydogan Lab, helping your day-to-day mentor (graduate student, postdoc, or staff associate). You are expected to commit at least 12-15 hours a week on a consistent basis. In turn, you will gain research experience and partake in scholarly endeavors (such as data collection, analysis, interpretation, write-up, and potentially even becoming an author on a paper).

Topic(s):

Computer Science, Data Science & Math, Natural & Physical Science

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

“Pausing” Biological Time: Molecular Mechanisms, Biological Functions, and

Description:

How do our cells keep time – or pause it – to shape their physiology? The Aydogan Lab investigates the fundamental principles of biological time control in animal development, metabolism, and disease. We study how biological time can be suspended, uncovering the molecular circuits and metabolic programs that underlie quiescence, senescence, and dormancy. As such, this exciting opportunity involves working alongside expert scientists at the interface of metabolism and biological time control, using techniques ranging from cell and molecular biology to bioinformatics, from advanced super-resolution microscopy to machine learning approaches.

Much of the collective understanding of biological time control has historically been shaped around the progression of time (e.g., the cell cycle, circadian clock, segmentation clock, and so on). Considerably less effort is devoted to research on the suspension of biological time, such as cellular arrest, quiescence, senescence, dormancy, and suspended animation. Unravelling the molecular circuits and metabolic sustenance programs that underlie these cellular states holds significant biomedical promise, as they are implicated in phenomena as diverse as aging (e.g., senescence), cancer progression (e.g., post-metastatic dormancy), stress adaptation (e.g., suspended animation), and shaping tissue architecture (e.g., the quiescence in polyploidy). Understanding how biological time can be “paused” will also open the doors for investigating how cells decide between progressing vs. suspending their biological time.

Your role will be to assist with an ongoing project in this program of the Aydogan Lab, helping your day-to-day mentor (graduate student, postdoc, or staff associate). You are expected to commit at least 10 hours a week on a consistent basis. In turn, you will gain research experience and partake in scholarly endeavors (such as data collection, analysis, interpretation, write-up, and potentially even becoming an author on a paper).

Topic(s):

Data Science & Math, Natural & Physical Science, Medicine/Clinical Research

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

IoT Systems Projects

Description:

We design solutions and implement systems to make IoT systems reliable under high heterogeneity, and manageable and programmable at large scale. Students will be mentored by a PhD student and work on one of the following projects:

(1) Identity-independent IoT policy server. We build a policy server that focuses on parsing relationships instead of identities and evaluating incoming requests against policies (e.g., security- or energy-related policies). We will study policy specifications, implement some new features based on an existing prototype, and evaluate the system using IoT datasets.

(2) Distributed IoT device discovery and authorization. We will explore solutions based on mDNS and capability tokens.

(3) Firewall solutions tailored to IoT. We will go through some quick tutorials to grasp a programming language for programmable networks and build a firewall prototype based on it. The firewall will introduce manufacturer-specified profiles and dynamically capture network traffic.

(4) Usability study for IoT device interoperability. We will study new IoT protocols for interoperation between different IoT devices and platforms.

Topic(s):

Computer Science

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

Database Management System Projects

Description:

Our group will be offering two research projects. The first project involves extending a database management system with features to allow matrix and matrix-like operators to be used alongside traditional relational algebra query operators. These matrix operators could potentially help support machine learning within the DBMS. The second project involves using novel processor-in-memory technologies to implement database query algorithms.

Topic(s):

Computer Science

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

Hidden Rhythms of the Cell: Role of Autonomous Cellular Clocks

Description:

How do our cells keep time – or pause it – to shape their physiology? The Aydogan Lab investigates the fundamental principles of biological time control in animal development, metabolism, and disease. We particularly focus on the emerging concept of autonomous clocks – subcellular timing mechanisms that are typically synchronized with major rhythmic programs such as the cell’s division cycle or its circadian clock, but which can also run independently to govern key events in cellular physiology. As such, this exciting opportunity involves working alongside expert scientists on biological time control, using techniques ranging from cell and molecular biology to bioinformatics, from advanced super-resolution microscopy to machine learning approaches. Our knowledge of cellular timing has traditionally revolved around the cell cycle, with CDK/Cyclin complexes serving as a master clock for the cell. Recent discoveries from the Aydogan Lab and others challenge this notion, revealing the existence of ‘autonomous clocks’ – timing mechanisms that are typically coupled by CDK/Cyclins to synchronize with nuclear divisions, but can also function autonomously with distinct timekeeping roles in physiology. Despite their emerging significance in cancers and tissue physiology, the fundamental mechanisms and design principles that operate autonomous clocks remain largely elusive. We still do not know the molecular underpinnings of how these clocks can couple with CDK/Cyclins during the cell cycle. Similarly, we have little information about the signaling mechanisms that help leverage their autonomy to act as quality control mechanisms, whether as fail-safe operators when the cell cycle halts erroneously, or as pacemakers when the cell cycle is silenced post-mitosis. Leveraging a wide variety of techniques and model systems, the Aydogan Lab strives to decode the principles of autonomous clocks and shed light on their relevance in physiology.
Your role will be to assist with an ongoing project in our lab, helping your day-to-day mentor (graduate student, postdoc, or staff associate). You are expected to commit at least 10 hours a week on a consistent basis. In turn, you will gain research experience and partake in scholarly endeavors (such as data collection, analysis, interpretation, write-up, and potentially even becoming an author on a paper).

Topic(s):

Computer Science, Engineering, Natural & Physical Science

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

Measuring Volcanic Cone Erosion Using Aerial Photogrammetry

Description:

Okmok volcano in the Aleutian Islands had a large eruption in 2008, which created a new volcanic cone. Since then, the cone has been eroding rapidly. We are trying to quantify this erosion, and we want to do this using sets of photos we collected from a helicopter in 2021, 2022, and 2023. We will create three-dimensional digital topography maps of the volcanic cone that formed during the 2008 eruption. The student will use the technique of Structure from Motion and photogrammetry, and tools such as GIS, to create the maps.

Topic(s):

Engineering, Natural & Physical Science

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

Uncovering Neural Circuit Mechanisms of Sensory Learning in Fruit Flies

Description:

Learning typically involves multiple senses; in humans and other animals, multisensory integration can enhance memory formation. The goal of this project is to uncover neural circuit mechanisms of multisensory learning in the fruit fly Drosophila melanogaster. The student will be part of a collaborative team whose work includes behavioral studies, neural imaging, and modeling. The student will be responsible for running the behavioral assays, including data collection and analysis. In addition, the student is expected to help with general lab upkeep and fly work. The student will also be expected to meet regularly with the research mentor and present their work at least once a year during a lab meeting. Prior lab experience is not required; what matters most is curiosity, reliability, and a strong willingness to learn. Training and mentorship are central parts of this position.

Topic(s):

Natural & Physical Science

Hybrid

Organization

Columbia University

Published Date

Sep 9, 2025

More Projects Coming Soon

ProjectPie
