Methodology for data analysis in medical sciences

author: Janez Stare, Faculty of Medicine, University of Ljubljana
produced by: S.TV.A.d.o.o.
published: Sept. 23, 2011,   recorded: September 2011,   views: 4390

The topic of our research program is the methodology for discovering actual or possible patterns, trends and associations in medical data. These are methods for discovering new knowledge or generating new hypotheses that may lead to knowledge. The data we analyse arise mainly from research, but also from routine medical practice. In recent years, we have paid special attention to methods for generating hypotheses from bibliographic data. We are also cautiously expanding the scope of our research, presently into the field of electrical stimulation of smooth muscles and the associated electromyography. In brief, our research can be divided into three sub-fields:

  1. Biostatistics
  2. Scientometrics
  3. Data mining in bibliographic databases

The focus of our research in biostatistics is on regression models for survival analysis, especially the Cox model. Our main topics related to the Cox model have so far been explained variation, prognostic value and frailties. During the forthcoming five-year period, we will concentrate our efforts on time-varying coefficients and on testing specific alternative hypotheses, such as crossing hazards. The presently available tests do not distinguish the latter situations from the null hypothesis. Besides the Cox model, we will continue to study intensively the field of relative survival, where we have recently developed an entirely new method. We will also extend the scope of our research beyond survival analysis by investigating methods for assessing the goodness-of-fit of the logistic regression model. The present approaches are based on grouping of units, which has several disadvantages. Our approach will be based on applying results from the theory of stochastic processes, especially Brownian motion.
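For orientation, the Cox model and its extension to time-varying coefficients can be written in the conventional textbook form below. The notation (baseline hazard h_0, covariate vector x, coefficients β) is assumed here for illustration and is not taken from the lecture itself.

```latex
% Cox proportional hazards model: coefficients constant over time
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)

% Extension with time-varying coefficients
h(t \mid x) = h_0(t)\,\exp\!\left(\beta(t)^{\top} x\right)

% Single binary covariate: hazard ratio between the two groups
\mathrm{HR}(t) = \exp\!\left(\beta(t)\right)
```

With a single binary covariate, the hazards of the two groups cross at a time t* where β(t) changes sign, that is, where the hazard ratio passes through 1.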

Research in scientometrics is relatively new in general, and virtually nonexistent in Slovenia. It is almost impossible to conduct such research without an adequate bibliographic database, so the Biomedicina Slovenica database is of fundamental importance for us. Another indispensable tool is a system for automated citation analysis, which we have also developed. The third key factor is the selection of appropriate indicators, which has also been a field of our expertise for a number of years. Only the combination of these three components makes it possible to work on research evaluation, and even then the methodological problems do not end. Bibliographic databases are organised in a way that prevents the use of standard data-analytic approaches. Hence, we have developed a system based on OLAP (Online Analytical Processing) methodology that transforms data from bibliographic databases into a multidimensional orthogonal structure, which can then be analysed by the usual statistical methods. Two of our staff recently published an article on this in Scientometrics, the leading journal in the field. During the next five years, we will be mainly interested in research trends in Slovene medicine, as well as in the influence of the number of authors, inter-institutional co-operation, authors' citation history and other factors on the impact of publications.
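The idea of turning bibliographic records into a multidimensional structure can be illustrated with a minimal sketch. It is not the system described above: the records, field names and the helper functions `flatten` and `roll_up` are all hypothetical, and citation counts are simply summed as an example measure.

```python
from collections import defaultdict

# Hypothetical bibliographic records. The multi-valued "authors" field is what
# makes such data awkward for standard one-row-per-observation analysis.
RECORDS = [
    {"id": 1, "year": 2009, "authors": ["A", "B"], "citations": 10},
    {"id": 2, "year": 2009, "authors": ["A"], "citations": 4},
    {"id": 3, "year": 2010, "authors": ["B", "C"], "citations": 6},
]


def flatten(records):
    """Expand each record into one fact row per (record, author) pair."""
    for rec in records:
        for author in rec["authors"]:
            yield {
                "id": rec["id"],
                "year": rec["year"],
                "author": author,
                "citations": rec["citations"],
            }


def roll_up(facts, dimensions, measure):
    """Sum `measure` over every combination of the chosen dimensions."""
    cube = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in dimensions)
        cube[key] += row[measure]
    return dict(cube)


facts = list(flatten(RECORDS))
by_author_year = roll_up(facts, ("author", "year"), "citations")
by_author = roll_up(facts, ("author",), "citations")
```

Note that in this sketch each paper is credited in full to every one of its authors, so summing the fact rows by year alone would count multi-author papers more than once. Choosing how to count is part of the indicator selection problem mentioned above.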

Data mining in bibliographic databases is a novel approach to browsing such databases. So far, we have developed a system for supporting biomedical discovery. The system aids researchers in creating new hypotheses, which can then be tested using established research methods. Our approach treats hypotheses as relations between biomedical concepts that have not yet been published in the scientific literature. The core of the system is the Medline bibliographic database, which in the present version is joined with the LocusLink, HUGO, OMIM and UniGene genetic databases. This makes the system particularly useful for discovering new relations in the field of genetics, such as predicting candidate genes for a new disease.
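One way to read "relations between concepts that have not been published yet" is as pairs of concepts that never appear together in any document but share an intermediate concept. The sketch below illustrates that general scheme only. It is not the system described above, and the documents, concept names and the functions `cooccurrence_pairs` and `candidate_relations` are hypothetical.

```python
from itertools import combinations

# Hypothetical documents, each reduced to the set of concepts it mentions.
DOCUMENTS = [
    {"disease_X", "protein_P"},
    {"protein_P", "gene_G"},
    {"disease_X", "symptom_S"},
    {"gene_G", "gene_H"},
]


def cooccurrence_pairs(documents):
    """Return every unordered concept pair that shares at least one document."""
    pairs = set()
    for concepts in documents:
        for a, b in combinations(sorted(concepts), 2):
            pairs.add((a, b))
    return pairs


def candidate_relations(documents, start):
    """Map each concept linked to `start` only via an intermediate concept
    to the set of intermediates that connect them."""
    pairs = cooccurrence_pairs(documents)

    def neighbours(concept):
        return {b if a == concept else a for a, b in pairs if concept in (a, b)}

    direct = neighbours(start)  # relations already "published"
    candidates = {}
    for middle in direct:
        # Keep only targets that never co-occur with `start` directly.
        for target in neighbours(middle) - direct - {start}:
            candidates.setdefault(target, set()).add(middle)
    return candidates


hypotheses = candidate_relations(DOCUMENTS, "disease_X")
```

Here `gene_G` is proposed as a candidate for `disease_X` because both co-occur with `protein_P` but never with each other. A real system would also need to rank candidates, for example by the number or strength of intermediate links.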
