Data Science Student Lightning Talks

Title: Data Science Student Lightning Talks
Presenter(s): Trace Freeman, Jacob Haarala, Hayden McDonald, Lydia Sloan, Benjamin Zamzow, and Karl Schubert
Date: November 17, 2021
Description: During this webinar we will hear from five data science students working in the Education Research Theme.
Overview Discussion, presented by Dr. Karl Schubert
Course Equivalency Project, presented by…

Towards Robust Machine Learning under Distribution Shift and Adversarial Attack

October 27, 2021 (Xintao Wu)

As big data and AI technologies are deployed to make critical decisions that potentially affect individuals (e.g., employment, college admissions, credit, and health insurance), the public is increasingly concerned about the privacy, fairness, safety, and robustness of data analytics, collection, sharing, and decision making. In this talk, we first give an overview of our social awareness research, in particular how to mitigate the side effects of enforcing one social concern on another, and how to address multiple social concerns simultaneously. We then focus on the robustness of machine learning under two representative scenarios: distribution shift and adversarial attack. For the first scenario, we present robust learning based on kernel reweighting and the Heckman model. For the second, we present an adaptive defense that purposely leverages multiple types of adversarial samples to learn context information during training. We conclude the talk with some future research directions…

Data Curation and SARS-CoV-2: population genomics of 2.5 million genomes

COVID-19 has become a global pandemic, and Arkansas has recently seen a dramatic increase in the number of cases, mainly due to a new variant (“Delta”). A population genomics approach shows that we are in the third wave, with the current Delta variant accounting for about 83% of the strains sequenced. This wave was preceded by the Alpha variant, which peaked in March of this year (2021), and by another, less characterized variant (Janus), which peaked in September 2020. Each of these variants has become better adapted for infecting and spreading within the human population…

Education, Outreach, and Workforce Development

Through DART, we plan to implement a wide range of professional development and data science education activities to engage K-20 learners in Arkansas. Our vision is that Arkansas will have a statewide educational ecosystem in which learners of any age can receive a designed, consistent, scaffolded education in data science, with further educational or job opportunities at appropriate points in their careers. To accomplish this, our mission is to create a model data science and analytics program for Arkansas schools that will promote problem-based and experiential pedagogy in critical thinking and analysis, technology familiarity, and a foundation in math and statistics…

Learning-based Approaches to Data-driven Predictions

A major challenge in building secure and widely adopted deep learning systems is that they sometimes produce unexplainable or unpredictable misclassifications. This talk gives an overview of initial efforts toward techniques that use large-scale deep learning with multi-source integrated data sets. In addition, we introduce the integration of statistical learning approaches with learning-based frameworks…

Media Matters: Innovations to Improve the Value Derived from Social Media Networks

Social media platforms have billions of active users and have significantly impacted our society. New types of platforms and new features in existing platforms continue to be developed to meet users’ demands. With an increasingly large amount of unstructured social data on these platforms, social media and networking analysis research aims to develop efficient, reliable, scalable, explainable, reproducible, and theoretically grounded data science approaches to understand our digital behaviors and make these platforms safer and more valuable. Our talk will focus on collective opinions and their evolution, deviant behavior modeling, automatic annotation of multimedia data, and informing disaster response with social media…

Socially Aware Data Analytics

The public is increasingly concerned about privacy, fairness, safety, and robustness in data analytics, data collection, data sharing, and decision making. The social awareness thrust team will present their cutting-edge research on socially aware data analytics that can address social concerns and enable big data analytics to promote social good and prevent social harm…

First Steps toward a Data Washing Machine

Data has a life cycle that runs from planning through acquiring, cleansing, storing and sharing, integrating, and applying to disposing. While AI and machine learning have taken the application of data to new levels, the other phases remain largely manually mediated processes. The research goal for the Data Life Cycle and Curation thrust is to develop fully automated processes for those other phases. Today's presentation describes some of the research progress toward automating the data cleansing and data integration phases of the data life cycle…

DART Cyberinfrastructure Resources

Dr. Fred Prior, Dr. Chris Angel, Dr. David Chaffin, and Dr. Pawel Wolinski will present an overview of the Coordinated Cyberinfrastructure efforts on the DART project, including the provision of secure, distributed, agile, scalable, and on-demand services. This presentation will feature a demonstration using simple, “real-world” data science examples within the Arkansas Research Platform, including the DART GitLab repository, Globus file transfer, and Pinnacle-portal…

Welcome to DART

Dr. Jackson Cothren will present an overview of the National Science Foundation-funded EPSCoR Track 1 award entitled, “Data Analytics that are Robust and Trusted (DART): From Smart Curation to Socially Aware Decision Making.”…