Real Problems, Real Teamwork

March 7, 2017

Data+: Duke’s award-winning approach to education for the data age

Kyle Bradbury and Data+ students

In ongoing interdisciplinary research led by Leslie Collins of Duke ECE, Kyle Bradbury of Duke’s Energy Data Analytics Lab and Tim Johnson of the Nicholas School of the Environment, Duke undergraduate teams use machine learning to assess U.S. solar capacity and energy consumption—providing valuable data to inform smart grid infrastructure planning.

Duke’s nationally recognized Data+ program brings teams of faculty and students together each summer to explore data-driven approaches to real-world problems—including many from clients such as Accenture, Fidelity Charitable and Duke Health. Impactful projects like the ones below will soon become part of every Duke Engineering undergraduate’s experience with the debut of an innovative new Data Science course.

Solar Counts

With enough solar installations to power 6 million homes, the U.S. energy infrastructure is rapidly evolving. But while individual states track solar locations and capacities on a broad scale, there is little information about exactly where solar energy is emerging on a city or neighborhood level.

With this more precise data, officials could predict where to install new technology to meet changing demands, social scientists could better understand how policy affects solar energy adoption and economists could better value the future of the 8,000 solar companies employing more than 200,000 American workers.


It’s a gigantic undertaking of great importance, but Duke has just the people for the job: our undergraduates, who recently began figuring out how to tally solar capacity from satellite imagery through Duke’s Data+ Program.

A small group of four students, mentored by graduate students and faculty from Duke Engineering and Duke’s Energy Initiative, spent the summer building a dataset by meticulously annotating 58 square miles of satellite imagery of Fresno, California. The team then coded their own proof-of-principle machine learning algorithm that was able to identify solar panels with over 90 percent accuracy.

The group then passed the baton to Bass Connections—a program at Duke supporting interdisciplinary collaborations between faculty and students. Led by the same faculty, undergraduates expanded the dataset to include three more cities in California, annotating 20,000 individual solar panels from 1.5 billion pixels of satellite imagery from the U.S. Geological Survey.

The students recently published the massive dataset in an open, online journal, providing other researchers worldwide with ground-truth data that could help program algorithms to spot all types of objects from the sky.

“We get a lot of amazing students who come in wanting to change the world. With these data-focused programs, it’s not just about the academics of data analysis. They’re getting to work on actual challenges being faced by industries and corporations on a daily basis, and that is a valuable, rewarding and motivational experience.”

Leslie Collins
Professor, ECE
Member, Duke Energy Initiative

How can we optimize parking on a crowded campus?

Working with Duke Parking and Transportation, a student team examined parking patterns across the campus and built an interactive “redirection” tool that could help students and employees figure out the best place to park if their preferred lot is full. The partners are now discussing ways to operationalize the tool at Duke.
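As a rough illustration only (the article does not describe how the team's tool works), the idea of redirecting a driver when a preferred lot is full could be sketched in Python as follows. The `Lot` fields, the `suggest_lot` function and the "nearest lot with a free space" rule are all assumptions made for this sketch, not details of the Duke tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Lot:
    """Hypothetical record for one parking lot (fields are assumptions)."""
    name: str
    capacity: int
    occupied: int
    walk_minutes: float  # assumed walking time from the lot to the destination

    @property
    def free(self) -> int:
        # Number of open spaces, never negative even if counts are noisy.
        return max(self.capacity - self.occupied, 0)


def suggest_lot(preferred: str, lots: List[Lot]) -> Optional[Lot]:
    """Return the preferred lot if it has space, else the closest open lot.

    Returns None when every lot is full.
    """
    by_name = {lot.name: lot for lot in lots}
    choice = by_name.get(preferred)
    if choice is not None and choice.free > 0:
        return choice

    # Preferred lot is full or unknown: fall back to the open lot
    # with the shortest assumed walk.
    open_lots = [lot for lot in lots if lot.free > 0]
    if not open_lots:
        return None
    return min(open_lots, key=lambda lot: lot.walk_minutes)


if __name__ == "__main__":
    # Made-up example data, not real Duke parking figures.
    lots = [
        Lot("A", capacity=10, occupied=10, walk_minutes=2.0),
        Lot("B", capacity=10, occupied=5, walk_minutes=8.0),
        Lot("C", capacity=10, occupied=9, walk_minutes=4.0),
    ]
    suggestion = suggest_lot("A", lots)
    print(suggestion.name if suggestion else "No open lots")
```

A real tool would presumably draw on historical occupancy patterns rather than a single snapshot; this sketch only shows the redirection step itself.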

“Data+ was immensely beneficial for me,” said participant Mitchell Parekh, ECE’19. “I gained an understanding of what big data really is, along with the amount of work required to extract usable information from it.”

Can we build a better mouse map?

BME major Pablo Ortiz and his team worked with Duke’s Department of Biostatistics and Bioinformatics to develop a tool that helps researchers make better use of LungMAP, an open-access database of images of developing mouse lungs. They built an image segmentation pipeline that lets biologists and clinical researchers quantify changes in lung structure during fetal development and improve understanding of normal lung structure and function.

What’s the best location for Zika vaccination clinics?

Lindsay Hirschhorn ME’19 was part of a team charged with determining optimal vaccination clinic locations in Durham County for a simulated Zika virus outbreak. Working with researchers at RTI International to construct models of disease spread and health impact, the team developed an interactive visualization tool to show the results.

“Duke has done a tremendous job in creating a learning environment that promotes collaboration, drives innovation and creative problem-solving, and provides enough structure and mentoring to guide students to successful outcomes,” said RTI’s Thom Miano. “Data+ gives students an opportunity for valuable experience and employers an avenue for potential recruitment.”

“Medical applications, robots, infrastructure, new materials—it’s hard to think of an area of engineering that’s not going to be driven by data. At Duke, we’re training students to navigate in the digital economy not just by teaching them how to analyze data, but how to ask the right questions. The world needs people who know how to put the chain together.”

Robert Calderbank
Charles S. Sydnor Professor of Computer Science
Professor, ECE, Math and Physics
Director, Information Initiative at Duke