The speaker presented an application of machine learning techniques to the automatic diagnosis of heart disease. Such diagnoses currently rely mainly on physician experience, which is often too slow for emergency visits. Machine learning can speed this up; however, no differential diagnosis models for heart disease are currently available to emergency departments. The speaker proposed a tree-based model that can be built from structured data. Using this rule-based model on health record data, with NLP techniques included, the speaker successfully differentiated six types of heart disease. The results presented suggest the model has high potential for real-world use, though I think its recall is still subject to improvement.
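A tree-based model of this kind can be sketched with a standard library. This is a hypothetical illustration only: the feature names, the toy data, and the six disease labels are made up, and the speaker's actual model was surely trained on real structured health-record features.

```python
# Hypothetical sketch of a tree-based differential diagnosis model.
# All features, data, and labels are illustrative, not from the talk.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.random((200, 3))                # toy structured features
y = rng.integers(0, 6, size=200)        # six hypothetical disease types

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X, y)

# A fitted tree can be printed as human-readable rules, one reason
# rule-based models are attractive in clinical settings.
print(export_text(clf, feature_names=["age", "heart_rate", "troponin"]))
```

The rule dump is what makes such models easy for physicians to audit, compared with opaque neural approaches.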
Conversational AI is one of the most competitive domains in computer science. Such applications combine machine learning and NLP techniques to build intelligent dialog systems that can converse with human beings. The speaker introduced the current progress in building conversational bots and discussed virtual humans and interactive conversation technologies. He shared some promising applications in which we may play "conversation games" with these virtual humans. These conversations are not easy daily small talk, but rather belong to domains that require expert knowledge, such as medical conversations or interviews.
The speaker presented her project on event detection using an attention-based approach. Large amounts of event data now exist in many domains, such as social media, health records, and e-commerce sites. Understanding past events may help us anticipate future ones. The challenge is: given a sequence of events, how do we predict the type and time of future events? One existing mathematical tool for modeling such sequences is the point process, but it makes strong assumptions about the generative process that may not reflect reality. The speaker presented an RNN-based model with an attention layer that automatically learns the underlying dependencies among events from the event-sequence history. With results on real-world data, the speaker convincingly demonstrated the model's potential.
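The architecture described above can be sketched in a few lines of NumPy. This is my own minimal illustration under assumed dimensions, not the speaker's actual model: a vanilla RNN encodes the event history (type plus time gap), an attention layer pools the hidden states, and two heads score the next event's type and predict its time gap. Parameters are randomly initialized and the training loop is omitted.

```python
# Minimal sketch of an RNN-with-attention event model (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
n_types, emb, hid = 5, 8, 16

E = rng.normal(size=(n_types, emb)) * 0.1    # event-type embeddings
Wx = rng.normal(size=(emb + 1, hid)) * 0.1   # input weights (+1 for time gap)
Wh = rng.normal(size=(hid, hid)) * 0.1       # recurrent weights
va = rng.normal(size=hid) * 0.1              # attention scoring vector
Wt = rng.normal(size=(hid, n_types)) * 0.1   # next-type head
wd = rng.normal(size=hid) * 0.1              # next-time-gap head

def predict_next(types, gaps):
    """Encode an event history with an RNN, attend over the hidden
    states, and emit next-event type scores and a predicted time gap."""
    h = np.zeros(hid)
    hs = []
    for t, g in zip(types, gaps):
        x = np.concatenate([E[t], [g]])
        h = np.tanh(x @ Wx + h @ Wh)
        hs.append(h)
    H = np.array(hs)                         # (T, hid) hidden states
    a = np.exp(H @ va); a /= a.sum()         # attention weights over history
    ctx = a @ H                              # attention-pooled context
    return ctx @ Wt, ctx @ wd                # type scores, time-gap estimate

scores, gap = predict_next([0, 2, 1, 4], [0.5, 1.2, 0.3, 2.0])
print(scores.shape, float(gap))
```

The attention weights `a` are what replace the point process's hand-crafted intensity assumptions: the model learns which past events matter.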
The talk focused on data mining of clinical data. As medical research grows, clinical and biomedical data are also growing in volume, which in turn makes uncovering new knowledge harder. Machine learning techniques offer a way to discover new knowledge directly from the data. The challenges include capturing key information within text and handling standardization issues, which require NLP and data-integration techniques. Combining all of these, data mining frameworks have already been used to discover knowledge from text. The talk presented a series of cases illustrating how to extract, integrate, discover, and visualize knowledge. Collectively, these cases demonstrate the use of systematic processes and the development of open-source tools for transforming clinical and biomedical data into knowledge.
This webinar discussed cloud computing for online learning environments. The cloud offers developers of learning environments access to unprecedented amounts of learner data. Remote hosts and cloud services make data-driven development (D3) of learning environments possible. In such frameworks, the environment continually collects interaction data, which is used for ongoing evaluation and iterative development. This iterative structure lets online machine learning frameworks update quickly on the gradually accumulating learner data, so the model is always up to date. The speaker presented various case studies of deploying the D3 environment and discussed the potential of the data it generates.
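The continual collect-evaluate-update loop can be sketched with an incrementally trainable model. This is a toy illustration under my own assumptions (scikit-learn's `SGDClassifier`, fabricated interaction features and labels), not the webinar's actual pipeline.

```python
# Toy sketch of a D3-style loop: each batch of newly collected learner
# interactions incrementally updates the model, so it stays current
# without full retraining. Data and features are invented for illustration.
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(random_state=0)
classes = np.array([0, 1])               # e.g., "struggling" vs "on track"

rng = np.random.default_rng(0)
for batch in range(5):                   # each iteration: fresh interactions
    X = rng.random((32, 4))              # 4 hypothetical interaction features
    y = rng.integers(0, 2, size=32)
    model.partial_fit(X, y, classes=classes)   # incremental update

print(model.predict(np.zeros((1, 4))))
```

`partial_fit` is the key design choice here: the model object persists across batches while each update touches only the newest data.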
“Science is built on consistent, reliable, repeatable findings. Across disciplines, today’s academic researcher builds specialized models, runs complex simulations, and manages huge datasets. This often requires the processing and storing hundreds of terabytes of data. For innovation to happen at the speed of light, tremendous compute power is needed.”
As the speaker claimed, the era of GPU computing is imminent. As an architect at NVIDIA, the speaker introduced NVIDIA's solutions and methods for optimizing a research institution's infrastructure to take advantage of scalable tools that enable faster, more powerful computing. From the talk, I gained a basic understanding of the architecture of GPU computing and learned some simple methods for leveraging the hardware.
The talk was about a novel application of a sparse k-means clustering model. In the first part, the speaker presented a sparse k-means model with group-lasso regularization for omics data drawn from multiple datasets, for instance gene expression and RNA expression. By combining these datasets, the model can help discover new subtypes of diseases. The second part focused on model organisms, which are used in place of human subjects in medical studies. Recent contradictory reports on whether mouse models mimic human transcriptomic responses have sparked debate about the usefulness of animal models. The speaker developed a statistical evaluation framework with functional characterization for comparing differential systems across model organisms or across species.
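The core idea behind sparse k-means can be illustrated in a Witten-Tibshirani style: cluster the data, weight each feature by its between-cluster dispersion, and soft-threshold the weights so uninformative features drop to zero. This toy sketch omits the group-lasso structure across omics datasets that the talk described; the data and threshold are invented for illustration.

```python
# Toy sketch of the sparse k-means idea (not the speaker's exact method).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
X[:50, 0] += 3.0                        # only feature 0 separates the groups

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Per-feature between-cluster sum of squares (total minus within).
total = ((X - X.mean(0)) ** 2).sum(0)
within = sum(((X[labels == k] - X[labels == k].mean(0)) ** 2).sum(0)
             for k in range(2))
bcss = total - within

w = np.maximum(bcss - 0.5 * bcss.max(), 0)   # soft-threshold -> sparsity
w /= np.linalg.norm(w)                       # unit-norm feature weights
print(np.round(w, 2))
```

The sparsity in `w` is what lets the method point to the handful of genes that actually define a new disease subtype.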
The talk was about leveraging data science tools to solve biomedical problems. Machine learning can help in many aspects of this domain, such as the detection, diagnosis, treatment, and prevention of diseases. The speaker went through many detailed technical problems in such analyses, including sparsity, redundancy, and data quality. To address these challenges, the speaker and his team have done substantial foundational work on deep neural networks to accelerate training and achieve better results. I was especially impressed by a proposed optimization method, distributed asynchronous stochastic gradient and coordinate descent, developed for efficiently solving convex and non-convex problems. With these models, they worked with many datasets such as EMR (Electronic Medical Records) and TCGA (The Cancer Genome Atlas), and achieved many promising results.
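The asynchronous flavor of such optimization can be illustrated with a Hogwild-style toy: several worker threads update a shared parameter vector without any locking. This is only a sketch of the general idea on a noise-free least-squares problem of my own construction; the speaker's distributed method is far more sophisticated.

```python
# Toy Hogwild-style asynchronous SGD on least squares (illustrative only).
import threading
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = np.arange(5, dtype=float)
y = X @ true_w                        # noise-free targets
w = np.zeros(5)                       # shared parameters, no lock

def worker(seed, steps=2000, lr=0.01):
    r = np.random.default_rng(seed)
    for _ in range(steps):
        i = r.integers(len(X))
        grad = (X[i] @ w - y[i]) * X[i]   # single-sample gradient
        w[:] = w - lr * grad              # unsynchronized in-place update

threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(np.round(w, 1))
```

Despite occasional lost updates from the races, the iterates still converge on this well-conditioned problem, which is the observation that makes lock-free asynchronous methods attractive at scale.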
In the talk, the speaker introduced his work on crowd modeling.
Encoding collective behaviors and constructing a generative model from them is complex because of the dynamic nature of a crowd. The natural state space of endogenous factors, such as group topology, and exogenous factors, such as the presence and position of other objects, agents, and signals, is far too high-dimensional to be probed experimentally in a laboratory setting. The speaker examined the challenge of modeling such a complex system in the context of professional basketball games. He presented an unsupervised model that synthesizes real NBA team defensive movements. The final results are promising and may leave a human unable to judge whether a pattern is simulated or real. Finally, the potential of such modeling in a smart city was discussed, e.g., modeling cars at intersections to predict where they will go and to anticipate potential accidents.
This talk is a webinar given by Lynette Hoelter. It introduced a project powered by ICPSR regarding data fairness. Attention to data sharing has focused ethics discussions on the informed consent process, but collecting, sharing, and reusing data actually involve a series of potential ethical considerations. Various issues can arise during data sampling, data sharing, and data analysis. The establishment of the GDPR in Europe has raised many such ethical questions worldwide. In this talk, the speaker focused on the ways in which decisions about sampling, question wording, and even analyzing data can have ethical implications. Many detailed issues for researchers using social media data were discussed, and the talk provides a very good, though unofficial, guideline for me when taking advantage of such data.