By Dave DeFusco
On a living room floor, a parent and child sit together with a few simple toys. There is laughter, movement, pauses and small moments of connection that might look ordinary to an outsider. To pediatric clinicians and researchers, however, these moments of play are rich with meaning. They reveal how a child grows, how a parent supports that growth and how relationships are built through shared experiences.
The paper, “AI-powered play assessment approach using video language models: A feasibility study,” to be published in Smart Health in March by researchers in the Katz School’s Department of Graduate Computer Science and Engineering and School of Health Professions at Rutgers University, explores a new way to understand these moments more efficiently and consistently. The study brings together experts in occupational therapy, artificial intelligence and mathematics to ask a simple but powerful question: can computers help professionals better understand how parents and children play together?
Joint play between a parent and child is more than fun. It helps children develop thinking skills, emotional regulation, communication and social bonds. For decades, occupational therapists and other clinicians have used careful observation of play to understand a child’s needs and decide how to support them.
One widely used tool is the Parent/Caregiver Support of Children’s Playfulness (PC-SCP), developed by Amiya Waldman-Levi, co-author of the study and director of the Playfulness, Growth & Development Laboratory at Rutgers University, and Anita Bundy of the University of Colorado. Trained professionals watch a video of a parent and child playing together and score how the adult supports the child’s playfulness across 16 areas, such as encouragement, flexibility and emotional connection. These scores guide therapy, education and family support.
This process, however, takes time and deep training. Watching videos, learning the scoring manual and ensuring consistent ratings across professionals can be slow and tiring. To address these challenges, the research team developed AI-powered software that can analyze short videos of caregiver-child play.
The system uses a type of artificial intelligence called a deep neural network, which is designed to recognize patterns in complex data like images and videos. The computer “watches” the video frame by frame and looks for behaviors that match the PC-SCP guidelines. The researchers tested advanced video models known as Video Large Language Models, which can understand both visual scenes and instructions written in words.
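A pipeline like the one described typically turns each rubric item into a text instruction for the model and then parses a score out of the model's free-text reply. The following is a minimal sketch of those two steps only: the rubric items, score range, and function names here are illustrative placeholders, not the actual PC-SCP content or the study's code, and the video-model call itself is omitted.

```python
# Illustrative rubric items (NOT the real PC-SCP items).
RUBRIC = {
    "encouragement": "Does the adult verbally or physically encourage the child's play?",
    "flexibility": "Does the adult adapt the activity when the child changes focus?",
}

def build_prompt(item: str, scale_max: int = 4) -> str:
    """Turn a rubric item into a scoring instruction sent alongside the video."""
    question = RUBRIC[item]
    return (
        f"Watch the caregiver-child play clip. {question} "
        f"Answer with a single integer score from 1 to {scale_max}."
    )

def parse_score(model_output: str, scale_max: int = 4):
    """Extract the first in-range integer from the model's text reply."""
    for token in model_output.split():
        digits = token.strip(".,:;")
        if digits.isdigit() and 1 <= int(digits) <= scale_max:
            return int(digits)
    return None

print(build_prompt("encouragement"))
print(parse_score("Score: 3. The adult frequently praises the child."))  # 3
```

In practice the prompt would accompany sampled video frames in the model's multimodal input; only the text scaffolding is shown here.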
“We wanted to see if AI could support clinicians, not replace them,” said Waldman-Levi. “Our goal was to explore whether technology could help make play assessments more efficient, objective and accessible while still honoring the complexity of human relationships.”
The team recruited 37 parent-child pairs, including both neurotypical children and children with developmental differences, ages 1 to 6. Parents recorded 10- to 15-minute videos of natural play at home, using everyday toys and routines.
Two occupational therapy doctoral students from the Katz School, Chana Cunin and Vanessa Murad, were trained to score the videos manually using the PC-SCP. Their careful work ensured that the human scores were reliable and accurate, providing a strong standard for comparison. At the same time, Dengyi Liu, a Ph.D. student in Mathematics at the Katz School, helped design and evaluate the AI system. Under faculty guidance, the team tested several video models to see how well they could recognize parental behaviors and assign scores.
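Agreement between two trained raters of this kind is commonly quantified with a chance-corrected statistic such as Cohen's kappa. The study's exact reliability measure isn't stated here, so the sketch below uses made-up scores purely to illustrate the calculation.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items scored identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement: chance overlap given each rater's score frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[s] * counts_b[s] for s in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Illustrative scores on a 1-4 scale for five videos (not real study data).
a = [3, 4, 2, 3, 4]
b = [3, 4, 2, 2, 4]
print(round(cohens_kappa(a, b), 3))  # 0.706
```

Values near 1 indicate agreement well beyond chance; this is one standard way such manual scores are validated before being used as a benchmark for a model.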
Among the models tested, a fine-tuned version called Qwen2.5-VL performed best. It correctly matched human scoring about 38 percent of the time when choosing a single exact score, and about 61 percent of the time when its top five predictions were considered. While those numbers may sound modest, the researchers stress that the task itself is extremely complex.
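The two figures quoted are standard top-1 and top-k accuracy: the fraction of items where the model's single best guess, or any of its k highest-ranked guesses, matches the human score. A small self-contained illustration with made-up predictions:

```python
def top_k_accuracy(true_scores, ranked_predictions, k):
    """Fraction of items whose true score appears in the model's top-k guesses."""
    hits = sum(
        truth in preds[:k]
        for truth, preds in zip(true_scores, ranked_predictions)
    )
    return hits / len(true_scores)

# Toy example: human scores and model guesses ranked most-to-least likely.
truth = [3, 1, 4, 2]
preds = [
    [3, 2, 4, 1],  # top-1 hit
    [2, 1, 3, 4],  # top-1 miss, hit at rank 2
    [2, 3, 4, 1],  # hit at rank 3
    [4, 3, 1, 2],  # miss in the top 3
]
print(top_k_accuracy(truth, preds, k=1))  # 0.25
print(top_k_accuracy(truth, preds, k=3))  # 0.75
```

Top-k accuracy is always at least as high as top-1, which is why the 61 percent figure exceeds the 38 percent figure.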
“These models are being asked to understand subtle human behaviors, emotional tone and interaction patterns from a small number of videos,” said Honggang Wang, co-author of the study and chair of the Katz School’s Graduate Department of Computer Science and Engineering. “Given the limited data and the nuanced nature of parent-child play, the results show real promise for this approach.”
Wang emphasized that AI models improve with more examples and better training. “As datasets grow and models become more specialized,” he said, “we expect significant gains in accuracy and reliability.”
Automating parts of play assessment could save clinicians hours of work, reduce burnout and allow more families to receive timely support. It could also improve consistency across clinics, research studies and training programs.
“This research is about equity and access,” said Waldman-Levi. “If we can use AI responsibly, we can help ensure that high-quality assessments are available to more children, families and clinicians regardless of location or resources.”