Symposium of Yotta Informatics - Research Platform for Yotta-Scale Data Science 2022

15:10-15:40 Invited: Mitsuyuki Nakao (Tohoku University)

From Yotta to Unprecedented Scale Informatics

Data triage based on the “value” of data is a central concept of Yotta informatics, which arises from a shared awareness of the risk of big data overload. Yotta informatics therefore requires evaluating the value of data, which energizes interdisciplinary collaboration among the humanities, science, and technology. These interdisciplinary collaborations are sure to make further progress across a wide range of fields. In a different context from triage, one expected benefit of the data-driven approach is that latent and even counter-intuitive relationships and/or causality could be disclosed, which might lead to updating the knowledge systems on which the value evaluation relies. This benefit of the data-driven approach is enhanced by sensing and processing information and its materialistic properties (e.g., text and paper/fonts/ink, code and underlying physics) indistinguishably, temporarily turning off the function of the knowledge systems. This is because re-organization of the knowledge system is facilitated by resetting the distinction between the information and its associated materialistic properties. Tohoku University is becoming a unique node for data-driven science. In addition to the Tohoku Medical Megabank Project and large-scale measurement equipment such as a cryo-electron microscope, Tohoku University is the first national university to have an on-campus next-generation synchrotron radiation facility, which will be fully operational from fiscal 2023. These facilities are producing "unprecedented scale" data. The term "unprecedented scale" here does not simply refer to the size of the data. It refers to data that far surpasses conventional scales in terms of the speed at which data is brought to us, the resolution of data measurement and generation, and the diversity of data modalities.
In order to create innovation by developing advanced analytics for unprecedented scale data generated from the aforementioned large-scale facilities as well as from various research activities, the Unprecedented scale Data Analytics Center (UDAC) was established on January 1, 2022 as the third center of the Organization for Innovations in Data Synergy. Note that unprecedented scale data, by their nature, might include parts that lie outside the existing systems of knowledge, and therefore developing analytics that extract information from them is a further challenge originating from Yotta informatics. Taking over the concept of Yotta informatics, UDAC intends to provide an advanced and cutting-edge AIMD (AI, Mathematics, and Data science) human resource development environment by making use of its achievements.

15:40-16:10 Akinori Ito (Tohoku University)

Towards spoken dialog system development without speech recognition for conservation of endangered language

Many languages all over the world are in danger of extinction. Therefore, there have been many efforts to record and describe these languages while speakers are still present. However, merely recording and describing a language is insufficient to revitalize it. This project aims to develop a system that enables a learner to converse with the system in the endangered language. Although an ordinary spoken dialogue system uses a speech recognizer and a speech synthesizer to convert between speech and text, it is almost impossible to develop a speech recognizer and synthesizer for an endangered language. Therefore, we are developing a system that does not use a speech recognizer or synthesizer and yet realizes spoken dialogue using recorded speech material of the language. In the presentation, a speech matching method and its performance are explained as preliminary results.

16:15-16:45 Shinya Nakamura (Tohoku University)

Dissociable functions of the monkey dorsolateral prefrontal and premotor cortices in working memory revealed by functional disturbance with rTMS

The dorsolateral prefrontal cortex (DLPFC) is crucially important for the retention of information during a short period of time, i.e., working memory. The premotor cortex (PMC), which is connected to DLPFC, plays a key role in planning and preparing a specific motor action on the basis of incoming information from DLPFC. These brain areas should work together to accomplish a complex behavior. Here we show that DLPFC and PMC contribute differently to the performance of a spatial delayed response task in which monkeys are required to retain in mind, for a while, the visuospatial information necessary for the subsequent behavior. In this task, one of eight buttons arranged in a circle was illuminated as a cue, and pressing that button after a delay period was rewarded. We delivered repetitive transcranial magnetic stimulation (rTMS; 10 Hz, 10 pulses) unilaterally to either DLPFC or PMC during the delay period of the task to temporarily disturb the neural activity. Both DLPFC and PMC stimulation significantly impaired task performance in a delay-dependent manner, i.e., the longer the delay period, the more severe the impairment. With DLPFC stimulation, the impairment was observed when the target button was located in the hemifield contralateral to the stimulated hemisphere, regardless of which hand was used. With PMC stimulation, the impairment was observed when the hand contralateral to the stimulated hemisphere was used, regardless of the target location. Thus, the impairment caused by DLPFC stimulation was visual field dependent, whereas that caused by PMC stimulation was effector dependent. We further found that the effective timing of rTMS during the delay period differed between DLPFC and PMC. The effect of DLPFC stimulation was significant in the early phase, whereas that of PMC stimulation was significant at all timing parameters we tested.
These results clearly demonstrate that DLPFC and PMC function as different functional elements for successful performance of the spatial delayed response task.

16:45-17:15 Yoshiyuki Sato and Satoshi Shioiri (Tohoku University)

Prediction of subjective evaluation for images with facial expression analysis

The number of images created in the world is ever increasing, and it is essential to develop a technique that can recommend images a user prefers without imposing much effort on the user. In this study, we used machine learning models to estimate human preferences for images from spontaneous facial features extracted from video recordings of viewers' faces while they performed a natural preference evaluation task. We used distinct image categories and compared the results between categories. We show that the machine learning models can predict subjective evaluation better than human raters. We also show that preference is predicted from distinct facial features across different image categories.

17:20-17:50 Tomoko Imoto, Yuichiro Hoshino, Koji Yoshino, Hina Kashiwaba, Haruka Katsumori, Yoshiyuki Sato, Yusuke Ohsaki, Hitoshi Shirakawa (Tohoku University)

Eating Habits and Psychological Aspects of College Students during COVID-19 Pandemic

The COVID-19 pandemic has had a huge impact on eating habits and psychological well-being. Some first-year students leave their hometowns, start living alone, and begin choosing their own food. In addition, college students have been affected by an unusual college life, for example, online classes and the cancellation of extracurricular activities. We assumed that the categorization into traditional and Western diets, which previous studies of dietary habits have revealed, may be difficult to apply to the dietary habits of university students, or may be more biased, so we applied hierarchical cluster analysis to search for cross-item food intake patterns. We conducted a web questionnaire survey of students at T University regarding their awareness of diet and health, their eating habits, and their mental health. The survey period was from December 2020 to January 2021, and the number of valid respondents was 101. Dietary status was assessed by asking how frequently each food item was eaten, and mental status was assessed with the UPI, an index for measuring the mental health of university students. The hierarchical cluster analysis clarified the degree of mental health in each resulting group and showed the characteristics of each. The group with the most balanced diet has good mental health. The group with the second most balanced diet eats a lot of fermented foods and vegetables but also snacks a lot, and has poor mental health. Their health consciousness is high, so they may have started eating more fermented foods and vegetables out of anxiety, in order to prevent COVID-19. The group that eats a lot of fried food, instant food, and snacks has poor mental health.
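
As an illustration of the hierarchical cluster analysis mentioned above, the following is a minimal sketch in Python. It assumes average linkage with Euclidean distance, and the intake frequencies and scores are made-up toy values; none of this is taken from the survey, and the abstract does not state which linkage or distance was used.

```python
from itertools import combinations
from math import dist


def average_distance(points, cluster_a, cluster_b):
    """Mean Euclidean distance over all cross-cluster pairs of respondents."""
    total = sum(dist(points[i], points[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def average_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters
    (average linkage) until only n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_distance(
                points, clusters[pair[0]], clusters[pair[1]]
            ),
        )
        clusters[a] = clusters[a] + clusters[b]  # merge b into a (a < b)
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical weekly intake frequencies per respondent:
# (vegetables, fermented foods, snacks, fried/instant food)
intake = [
    (7, 5, 1, 1), (6, 6, 2, 1),
    (7, 6, 6, 2), (6, 7, 7, 1),
    (1, 0, 6, 7), (2, 1, 7, 6),
]
# Hypothetical questionnaire scores, one per respondent (toy values only).
score = [10, 12, 25, 27, 30, 28]

groups = average_linkage(intake, n_clusters=3)
group_means = [sum(score[i] for i in g) / len(g) for g in groups]
print(groups, group_means)
```

Once respondents are grouped by intake pattern, a per-group summary of a separate score, as in the last two lines, is one simple way to compare groups.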

17:50-18:20 Yoshitaka Kanomata (Tohoku University)

Archaeological interpretation on the emergence of pottery in Japan

The reason for the emergence of pottery has been a major concern in archaeology all over the world. Since the 1960s, the Japanese archipelago has been attracting attention as one of the oldest places where pottery emerged. At the Fukui cave in Nagasaki Prefecture, excavated in 1960, radiocarbon determinations associated with the oldest pottery were measured to be 12,000 BP. The result made the name of this site known worldwide. At the Odai-Yamamoto 1 site in Aomori Prefecture, excavated in 1998, several dates obtained from carbon adhesions on ceramics were older than 13,000 BP. Since the 2000s, older pottery has been found at several archaeological sites in China, and there is increasing interest in East Asian trends. In this research, various data related to the emergence of pottery were collected. The reason for and background of the phenomenon are estimated from these data. The presentation focuses on the practical research being conducted at the Hachimori site in Yamagata Prefecture.

18:20-18:50 Takaki Sato, Anqi Li and Yasumasa Matsuda (Tohoku University)

Dynamic panel analysis of subjective well-being in the COVID-19 outbreak in Japan

We conducted happiness surveys of around 22,000 respondents all over Japan in seven periods: Dec. 2019; Sep. and Dec. 2020; and Mar., June, Sep., and Dec. 2021. In this talk, we report the influence of COVID-19 on subjective well-being, evaluated from the surveys before and after the outbreak. We applied a dynamic regression model that describes the joint effects of individual and spatial factors to visualize the space-time behaviour of Japanese subjective well-being. Namely, we quantified the components of happiness driven by individual factors, such as age, sex, and income, and those driven by spatial factors at the prefectural level after controlling for the individual ones. Examining the dynamic changes of the individual and spatial factors over the seven periods, we see that the COVID-19 outbreak in Japan has damaged the subjective well-being of young females most seriously, and that serious damage is continuing, especially for the low-income group among them.

19:00-20:00 Invited: Peter G. Moffatt (University of East Anglia)

Modelling subject heterogeneity in experimental data

The use of controlled experiments in Economics has grown exponentially in recent decades, and a major reason for this explosion of interest is that the experimental approach averts a number of serious problems that have preoccupied econometric researchers for many decades: low sample sizes, missing values, omitted variables, measurement error, endogeneity, and so on. The idea is that in a controlled experiment, the sample size is set by the experimenter to attain the desired power, the sample is drawn randomly from the population, the treatment is assigned randomly, the effect of a single treatment is isolated with other effects eliminated, and all variables are measured without error. However, one significant econometric problem cannot be avoided: between-subject heterogeneity. Subjects vary in many dimensions: risk preference; concern for others; cognitive ability; etc. Hence it is inevitable that subjects will respond in different ways to the same stimulus. Ways of addressing this problem are the theme of this presentation. It is useful to consider two different types of heterogeneity. The first is discrete heterogeneity. This is the situation in which the population of subjects divides into a small number of distinct types. The finite-mixture approach is the natural estimation framework in this setting. One widely seen example is the level-k model, in which subject types are defined by their levels of reasoning in interactive games. The applications used as examples will be the 11-20 money request game and the Beauty Contest game. The second type of heterogeneity is continuous heterogeneity. This arises when there is continuous variation over the population in a preference parameter such as risk aversion, probability weighting, or discount rate. The method of Maximum Simulated Likelihood (MSL) is promoted as the best framework for modelling in this situation, particularly in cases in which there is more than one dimension of heterogeneity.
The applications used as examples will be individual choices over lotteries, and behaviour in 2-player games.
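
To make the finite-mixture idea for discrete heterogeneity concrete, here is a minimal sketch in Python for a level-k reading of the 11-20 money request game. Several things are assumptions made only for illustration and are not taken from the talk: a level-k type is assumed to request 20 - k, every type is assumed to tremble uniformly over 11..20 with a fixed probability eps, only the mixing proportions are estimated (by the EM algorithm), and the choice data are made up.

```python
def type_probs(level, eps=0.1):
    """Choice probabilities over requests 11..20 for an assumed level-k type:
    request 20 - level with probability 1 - eps, otherwise tremble uniformly."""
    target = 20 - level
    return {r: eps / 10 + (1 - eps) * (r == target) for r in range(11, 21)}


def em_mixing_proportions(choices, n_types=4, eps=0.1, n_iter=200):
    """Estimate the population share of each type by EM, holding the
    type-specific choice distributions fixed."""
    dens = [type_probs(k, eps) for k in range(n_types)]
    pi = [1 / n_types] * n_types
    for _ in range(n_iter):
        # E-step: posterior probability that each observed choice
        # was made by each type, given the current shares.
        posteriors = []
        for c in choices:
            weights = [pi[k] * dens[k][c] for k in range(n_types)]
            total = sum(weights)
            posteriors.append([w / total for w in weights])
        # M-step: each share becomes the average posterior for that type.
        pi = [sum(p[k] for p in posteriors) / len(choices) for k in range(n_types)]
    return pi


# Hypothetical requests from 100 subjects (toy data, not experimental results).
choices = [20] * 10 + [19] * 30 + [18] * 40 + [17] * 20
shares = em_mixing_proportions(choices)
print([round(s, 3) for s in shares])
```

In a fuller treatment the tremble probability, and any parameters of the type-specific behaviour, would be estimated jointly with the shares by maximizing the same mixture likelihood.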

15:00-16:00 Invited: Sachiko Kiyama (Tohoku University)

Neural and psychological evidence for emotional experience via poetic language

Words elicit emotional arousal based on their affective properties. Poetry, the art of words, represents beauty and humour through sophisticated language forms that have evolved in every culture. In particular, fixed verse, which refers to a type of poetry with a specified number of syllables along with the required rhyming, is based on the interplay of words and ethnic rhythm, with which poets pursue beauty and the truth of the world. This presentation will highlight current work investigating how Japanese haiku and senryu, the shortest fixed verses in world literature, inspire human emotional life.

16:10-16:40 Yoichiro Tanaka (Tohoku University)

Computational Storage Platform for Brain Neural Structure Analytics

The research status of a new computational storage platform for brain neural structure analysis will be presented. This is an interdisciplinary academic approach to unveiling microscopic and multi-scale neuron topology, as well as neural functions, with an intelligent computational storage scheme. Life science analytics of this kind involves large-scale data analytics and secure management of unstructured datasets, and therefore requires compute and storage to be unified in close proximity to the data source.

16:40-17:10 Takeshi Obayashi and Minoru Ikeda (Tohoku University)

Kinship Analysis on wild populations of olive flounder

The fishery is the foundation of food culture and the local economy in Japan. Because marine fish resources are declining globally, their proper management is essential for a sustainable fishery. Stock enhancement programs are practiced globally in marine coastal areas and typically involve stocking hatchery-reared fish into the native environment. For example, up to 20 million artificially reared olive flounder fingerlings are stocked annually in Japan. As a result, stocked fish have reached 10% of the total catch. However, such a massive number of released individuals may disturb the natural genetic diversity, including kinship between individuals in the wild stock. We examined methodologies for estimating the kinship of wild olive flounder using microsatellite data and obtained two results. (1) A widely used program, COLONY2, can predict kinship in densely related populations with high accuracy, but it is not suitable for detecting kinship in sparsely related populations. (2) A permutation test is a valid approach to controlling the false positive rate of kinship detection in a sparsely related population.
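
The permutation-test idea in result (2) can be sketched as follows in Python. This is an illustrative sketch under stated assumptions, not the authors' procedure: the pairwise statistic (number of shared alleles summed over loci), the null model (shuffling genotypes across individuals independently at each locus), and the simulated microsatellite data are all hypothetical choices.

```python
import random
from itertools import combinations


def shared_alleles(ind_a, ind_b):
    """Number of alleles shared between two individuals, summed over loci
    (0, 1, or 2 per locus)."""
    total = 0
    for locus_a, locus_b in zip(ind_a, ind_b):
        remaining = list(locus_b)
        for allele in locus_a:
            if allele in remaining:
                remaining.remove(allele)
                total += 1
    return total


def permutation_p_values(genotypes, n_perm=100, seed=0):
    """Pairwise p-values against a null built by shuffling genotypes across
    individuals at each locus, which breaks multi-locus kinship signal while
    keeping per-locus genotype frequencies."""
    rng = random.Random(seed)
    n, n_loci = len(genotypes), len(genotypes[0])
    null = []
    for _ in range(n_perm):
        columns = []
        for locus in range(n_loci):
            column = [ind[locus] for ind in genotypes]
            rng.shuffle(column)
            columns.append(column)
        shuffled = [tuple(col[i] for col in columns) for i in range(n)]
        null.extend(shared_alleles(a, b) for a, b in combinations(shuffled, 2))
    p_values = {}
    for i, j in combinations(range(n), 2):
        observed = shared_alleles(genotypes[i], genotypes[j])
        exceed = sum(s >= observed for s in null)
        p_values[(i, j)] = (1 + exceed) / (1 + len(null))
    return p_values


# Simulated data: 20 individuals, 20 loci, 8 equally frequent alleles.
rng = random.Random(1)
individuals = [
    tuple((rng.randrange(8), rng.randrange(8)) for _ in range(20))
    for _ in range(20)
]
# Make individual 1 an offspring of individual 0: one parental allele per locus.
individuals[1] = tuple((rng.choice(locus), rng.randrange(8)) for locus in individuals[0])

p_values = permutation_p_values(individuals)
flagged = sorted(pair for pair, p in p_values.items() if p < 0.01)
print(flagged)
```

Because the threshold is applied to p-values calibrated on permuted data, the rate at which unrelated pairs are flagged is tied to the chosen significance level rather than to how many true relatives the sample happens to contain.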

17:10-17:40 Shinichiro Omachi (Tohoku University)

Character recognition of historical Japanese documents considering character structures

Historical documents play an important role in various research fields. However, it is difficult to recognize characters in historical documents with machine learning techniques because some characters have only a small number of samples. We propose a character recognition method that focuses on character structures to compensate for the shortage of training data.

17:50-18:50 Invited: Hsiu-Ping Yueh (National Taiwan University)

Exploring Learning and Reading Behavior - From Observation, Tracking to Modeling

TBA

18:55-19:55 RIEC Nation-wide Cooperative Research Project Session

Ikuhisa Mitsugami (Hiroshima City University)

Gaze Estimation and Analysis for Consumer VR Goggles

The "Metaverse" (experiencing a 3D world through VR goggles) is being adopted not only for communication and commercial purposes but also in education. For assessing each student's degree of understanding in such environments, his/her gaze behavior is an important clue. We introduce some recent achievements in gaze estimation and analysis for this purpose.

Akinori Ito (Tohoku University)

The Virtual Classmate Project: Incorporating Spoken Dialogue Technology into Online Lecture

With the COVID-19 pandemic, online lectures have become more and more popular as a way to prevent infection in class. However, students who take online lectures, especially on-demand lectures, need more effort than in a real lecture to keep their attention on the lecture material. According to a survey, the completion rate of MOOCs (Massive Open Online Courses) is less than 10%. To enhance concentration on the lecture, we are developing a spoken dialogue agent that interacts with a student who is watching the lecture video. In this presentation, the overview and plan of the project are presented.

Renjun Miao, Haruka Kato, Yasuhiro Hatori, Yoshiyuki Sato, Satoshi Shioiri (Tohoku University)

Estimation of attention states using facial expressions for online lectures

Online lectures such as massive open online courses are becoming popular and familiar. It is difficult for teachers to know whether students are concentrating on the content because online lectures lack interactivity. The present study aimed to develop a method to estimate attention states from facial expressions in facial images recorded while students participate in online courses. For this purpose, we conducted an experiment measuring reaction time to a target that was content-irrelevant noise. We attempted to predict reaction time, which is an index of attention states, from facial features during the experiment, using a machine learning method. The method predicted the reaction times to some extent, suggesting the usefulness of facial features for estimating attention states.

Haruka Kato, Koki Takahashi, Yuta Horaguchi, Yasuhiro Hatori, Yoshiyuki Sato, Satoshi Shioiri (Tohoku University)

Predicting attention states during calculation in mind by facial expressions

Education can be improved by digital technologies through estimating the mental states of learners. We developed a method to estimate the mental states of learners using facial expressions. In order to investigate the relationship between facial expressions and attention states, we asked participants to mentally add or subtract pairs of 3- or 4-digit numbers while we recorded videos of their faces. A machine learning technique was used to predict hit rates and response times from facial features. The method predicted hit rates and response times at a significant level. This suggests that facial features are useful for estimating mental states related to brain processes during mental calculation.
