FLIGHT DELAY CAUSES AT SELECTED VISEGRAD GROUP INTERNATIONAL AIRPORTS

The aim of this article is to analyse the flight delay causes at base airports (Prague, Brno, Ostrava, Budapest, Bratislava, Katowice, and Warsaw), with a special focus on a selected airline company operating in the central European region. To process the data, methods of multivariate statistics were used, namely tests of independence in contingency tables, the Kruskal-Wallis test, cluster analysis, and correspondence analysis. Charter and scheduled flights have approximately the same percentage of delayed flights, delays occur most frequently in June, and Boeing 737-800 reported delays more frequently than Airbus A320. The research has shown that the highest number of delayed flights occurs in Budapest, the lowest number in Katowice. Short delays occur most often during the night, while long delays most frequently arise in the evening. The most common causes of longer delays are technical maintenance or an aircraft defect and previously delayed flights. Flight dispatch by supplier companies accounts only for rather short delays. Overall, the frequency of delayed flights increases with the size of the city and the airport.


INTRODUCTION
In cases of very significant flight delays, passengers are entitled to financial compensation under certain conditions. The amounts involved are rather high and may represent a considerable expense for airline companies. It is therefore necessary to try to eliminate delays, especially the long ones, so that airlines are not obliged to pay financial compensation to passengers. At the same time, eliminating delays would improve customer experience, as nobody enjoys long waiting times at airports. The rights of air passengers are stipulated in Regulation (EC) No. 261/2004 of the European Parliament and of the Council; for more details see European Consumer Centre Czech Republic (2020).
The principal objective of this work is to evaluate and assess the delay-caused problems at selected airports in the countries of the Visegrad Group. Airports have been selected based on the results of cluster analysis and internal information of the airline. These are the so-called Base Airports, i.e. airports that serve as an airline's home base with full facilities and personnel. In the first step, all flights at selected airports were analysed: this included differences between charter and scheduled flights, delays at specific airports, delays of different aircraft, and times of delays. Statistical hypothesis testing and the Kruskal-Wallis test proved useful in the identification of statistically significant differences. In the second step, the focus was on selected airports and delayed flights. This entailed a detailed (correspondence) analysis of delays, considering their length, time of occurrence, reasons and so on.
This study evaluates the causes of flight delays at base international airports (Prague, Brno, Ostrava, Budapest, Bratislava, Katowice, and Warsaw) used by a selected airline company operating in the Czech Republic. The causes of delays were classified based on the codes of IATA (the International Air Transport Association), adapted for the specific needs of this company; see Tab. 1. The dependence of delay causes on other factors was examined by means of tests of independence in contingency tables. Correspondence analysis was employed in order to display the results graphically.
According to Wang et al. (2019), delays in air travel cause economic losses for airlines and reduce the quality of travel. In their paper, the causes of delays are analysed by the methods of statistical physics. A delay represents an issue that affects both passengers and airport staff; this issue is addressed, for example, by Wu and Truong (2014) and Zámková et al. (2018). They came up with a comparison of the IATA delay data system with the coding system developed by the authors themselves. The article by Skorupski and Wierzbińska (2015) deals with the difficulties encountered due to late check-ins and looks for an optimal time limit after which it is appropriate to stop waiting for the latecomers. Jiang and Ren (2018) propose a model that can effectively describe the behaviour of passengers at various delays. Stone (2018) found that flight delays or cancellations have a negative impact especially on passengers at small airports (locals and tourists), who then have to travel to the transfer airport instead of departing from these small airports. This further increases the impact on the entire travel itinerary of these passengers. According to Forbes et al. (2015), it would be advisable for airlines to release information on delayed flights with a delay of more than 15 minutes. Further research has focused on modelling the course and propagation of delays during subsequent flights; see, for example, Campanelli et al. (2014) and Rebollo and Balakrishnan (2014). Optimization of delays is seen as a solution in articles by AhmadBeygi et al. (2008), Wang et al. (2020), Wu et al. (2016) and Belkoura et al. (2016). Wu and Law (2019) developed a model describing the propagation of delays to subsequent flights using a Bayesian network. Pamplona et al. (2018) propose procedures for optimal air traffic control to predict delays, using neural network methods.
Serhan et al. (2018) study the effectiveness of incorporating airline and passenger delay cost into an integrated airport surface and terminal airspace traffic management system. Problems with lost luggage are discussed in the article by Alsyouf et al. (2015).
The research by Zámková et al. (2017) showed that the most frequent cause of delay is the aircraft's delay on the previous flights. Moreover, the later in the day, the more delays caused by this reason occur. Compared to the abovementioned paper by Zámková et al. (2017), the current research uses more recent data (2015) only from major Visegrad Group airports, allowing for the generalization of results to fit the airports in Central Europe. This paper starts with an analysis of all flights operated by the selected airline in the given period and selected V4 destinations, regarding the total number of delayed flights and the number of delayed flights by flight type, aircraft type, and time period (summer season of 2015). Next, statistical testing (including the non-parametric Kruskal-Wallis test, chosen because the random variable is not normally distributed) allowed for the assessment of statistically significant differences between the groups. This was followed by cluster analysis, pinpointing the similarities in V4 departure destinations and explaining the selection process. From there on, only the selected airports have been reviewed in more detail with regard to the delay causes (correspondence maps and column relative frequencies).

METHODOLOGY AND DATA
Primary data cover the peak season of the selected airline (from 1st June 2015 to 30th September 2015) and include information on the length of delay as well as the delay causes. The data were obtained from an internal database of the observed airline. A substantial part of the data is categorical or suitable for categorizing. The processed data included the following information: departure date, aircraft type, flight type (charter, scheduled, etc.), place of departure, departure time, length of delay, and cause of delay.
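As an illustration of how a variable such as departure time might be categorized before building contingency tables, the following minimal sketch maps a departure time to one of four periods of the day. The night (00:01-06:00) and evening (18:01-24:00) bins follow the intervals quoted in the Results; the morning and afternoon boundaries, as well as the function name `time_of_day`, are assumptions made for this sketch and are not taken from the airline's data.

```python
from datetime import time


def time_of_day(departure: time) -> str:
    """Map a departure time to a period of the day (bin edges partly assumed)."""
    minutes = departure.hour * 60 + departure.minute
    if minutes == 0:
        # 00:00 is treated as 24:00, i.e. the end of the evening bin.
        return "evening"
    if minutes <= 6 * 60:
        return "night"        # 00:01-06:00
    if minutes <= 12 * 60:
        return "morning"      # 06:01-12:00 (assumed boundary)
    if minutes <= 18 * 60:
        return "afternoon"    # 12:01-18:00 (assumed boundary)
    return "evening"          # 18:01-24:00


# Hypothetical departure times, for illustration only.
labels = [time_of_day(t) for t in (time(5, 30), time(9, 15), time(18, 1))]
```

The same approach could be applied to the length of delay, once the category boundaries used in the tables are known.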
The character of the analysed data has determined the use of the corresponding independence tests. Řezanková (1997) claims that contingency tables of the type r × c (where r is the number of rows and c the number of columns) most often call for Pearson's chi-square test.
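To make the test of independence more concrete, the following minimal sketch computes Pearson's chi-square statistic and its degrees of freedom for an r × c table of counts. It is only an illustration in plain Python, not the procedure used in Unistat or Statistica; the function name and the example counts are hypothetical, and the p-value (which requires the chi-square distribution) is deliberately left out.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts (not from the study):
# rows = two aircraft types, columns = delayed / not delayed.
example = [[30, 70], [20, 80]]
stat, dof = chi_square_statistic(example)
```

A large statistic relative to the chi-square distribution with `dof` degrees of freedom speaks against independence of the two variables.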
Correspondence analysis is an effective tool enabling the display and summary of a set of data in two-dimensional graphic form. It decomposes the chi-squared statistic into orthogonal factors. The distance between individual points is called the chi-squared distance. The distance between the i-th and i'-th row is
$$d(i, i') = \sqrt{\sum_{j} \frac{(r_{ij} - r_{i'j})^2}{c_j}},$$
where $r_{ij}$ represents the components of the row profiles matrix $R$ and the weights $c_j$ correspond to the components of the column loadings vector $c^T$. This analysis serves to reduce the multidimensional space of row and column profiles while preserving as much of the original data information as possible; see Hebák et al. (2007). The total variance of the data matrix may be measured by the inertia; see e.g. Greenacre (1984). The processing of the data was carried out in the Unistat and Statistica software.
Cluster analysis allows the objects of the input data matrix to be distributed into several clusters; for more details see Hendl (2006). The aim is to achieve a situation where the objects within a cluster are as similar to each other as possible and objects from different clusters are as dissimilar as possible. A distance measure is used to evaluate the degree of similarity of the objects; the Euclidean distance can be used for quantitative variables. The most common procedure of cluster analysis is hierarchical clustering, i.e. creating a hierarchical sequence of decompositions; for more details see Hebák (2007). The result of hierarchical clustering is best viewed as a tree diagram (dendrogram). Distances between clusters are derived from the distances between objects. There are several agglomerative procedures, e.g. the Ward method based on Ward's criterion of decomposition quality; for details see Hebák (2007).
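The chi-squared distance between row profiles and the total inertia used in correspondence analysis can be sketched in a few lines of plain Python. This is a minimal illustration under the usual textbook definitions (row profiles as row-wise proportions, column masses as weights); the function names and the example counts are hypothetical, and the sketch does not reproduce the output of the software used in the study.

```python
from math import sqrt


def row_profiles_and_masses(table):
    """Return (row profiles, row masses, column masses) of a table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_masses = [sum(col) / n for col in zip(*table)]
    profiles = [[x / rt for x in row] for row, rt in zip(table, row_totals)]
    row_masses = [rt / n for rt in row_totals]
    return profiles, row_masses, col_masses


def chi2_distance(profile_a, profile_b, col_masses):
    """Chi-squared distance: Euclidean distance weighted by 1 / column mass."""
    return sqrt(sum((a - b) ** 2 / c
                    for a, b, c in zip(profile_a, profile_b, col_masses)))


def total_inertia(table):
    """Mass-weighted squared chi-squared distance of rows to the average profile."""
    profiles, row_masses, col_masses = row_profiles_and_masses(table)
    centroid = col_masses  # the average row profile equals the column masses
    return sum(m * chi2_distance(p, centroid, col_masses) ** 2
               for p, m in zip(profiles, row_masses))


# Hypothetical counts (not from the study).
example = [[30, 70], [20, 80]]
profiles, row_masses, col_masses = row_profiles_and_masses(example)
d = chi2_distance(profiles[0], profiles[1], col_masses)
inertia = total_inertia(example)
```

The total inertia obtained this way equals Pearson's chi-square statistic of the same table divided by the total number of observations, which is why the correspondence map can be read as a graphical decomposition of that statistic.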
The Kruskal-Wallis test by ranks is a nonparametric method for testing whether samples originate from the same distribution. It is used for comparing two or more independent samples of equal or different sample sizes. It extends the Mann-Whitney U test, which is used for comparing only two groups. The null hypothesis assumes that the mean ranks of the groups are the same. It can be used as an alternative to the parametric one-way analysis of variance (ANOVA) when the population cannot be assumed to be normally distributed. For more details see Anděl (2005) and Hendl (2006).
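The rank-based idea behind the Kruskal-Wallis test can be illustrated with a short plain-Python sketch that pools the samples, assigns average ranks to tied values, and computes the H statistic. It is a simplified illustration: the correction for ties that statistical packages usually apply is omitted, no p-value is computed, and the function names and delay values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to N; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (without tie correction)."""
    pooled = [x for group in groups for x in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical delay lengths in minutes for three groups (not from the study).
h = kruskal_wallis_h([[5, 10, 15], [20, 25, 30], [35, 40, 45]])
```

Under the null hypothesis, H is approximately chi-square distributed with (number of groups - 1) degrees of freedom, so large values of H indicate that at least one group differs from the others.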

RESULTS
The frequency table below indicates that the lowest share of delayed flights occurs at the airport in Katowice (17.52%). The base aircraft is not so busy in Katowice (the total number of operated flights is lower), therefore the probability of delay occurrence is lower, and if delays occur, it is easier to take care of the issue. The worst situation with regard to delays is at the airport in Budapest (45.06%), where the airline has only one base aircraft, which is moreover very busy. In general, airports in smaller cities with lower traffic intensity have fewer delayed flights; see Tab. 2.
Tab. 3 lists flight delays according to the flight type. Ferry flights seem to be most frequently delayed; however, the number of such flights is very limited. Interestingly, charter and scheduled flights are delayed about equally often (charter 30.4% and scheduled 32.4%).
Tab. 4 illustrates the fact that delays occur most frequently in June, probably due to the start of the peak season, and in September, at the end of summer, apparently marked by longer technical checks preventing further complications.
Boeing 737-800 aircraft tend to be delayed more often than Airbus A320 aircraft (Tab. 5). On two occasions, a replacement aircraft needed to be used (due to some type of emergency) in order to cover two flights (SUB); both of those flights were delayed, for obvious reasons. See the graphic representation in Fig. 1-4. The outliers in Fig. 1-4 clearly indicate that the monitored data are not normally distributed, hence the application of the Kruskal-Wallis rank sum test (see the results in Tab. 6). See Tab. 7 for Dunn's nonparametric comparison for post hoc Kruskal-Wallis testing (only results statistically significant at the 5% level are shown). The post hoc analysis confirmed statistically significant differences between Boeing 737-800 and Airbus A320 (p-value 0.025), with the Boeings being more prone to delays.
The selection process: only the so-called "Base Airports" with the airline's complete facilities have been selected (according to internal airline information), while also considering the fact that 90% of all delayed flights occur at the selected airports. The remaining 10% of delays occurred at the remaining V4 airports (Czech Republic, Slovak Republic, Poland, Hungary).
Now, let us focus on individual airports and the lengths of delays. Short delays under 30 minutes are least frequent at the Katowice airport and most frequent at the Warsaw airport. At other airports, short delays occur in approximately 40% of delayed flights. There are no significant differences in longer delays (under one hour); in total, this length of delay concerns about 25-30% of delayed flights. Delays longer than 1 hour are most frequent at the Katowice airport; see Tab. 8. The Katowice airport has the lowest total number of flight delays; nevertheless, if delays occur, they are usually longer than 1 hour. This airport has only poor technical support available; for that reason, it would be advisable to expand its technical background facilities.
The correspondence map shows that the Katowice airport is situated close to long delays over 2 hours, i.e. that is where long delays occur most often. It is also obvious that medium-length delays between 1 and 2 hours are very frequent at the Bratislava airport; a similar result is also apparent from the higher values of the relevant column frequencies. Short delays under one hour often occur at the Brno and Prague airports; see Fig. 6. Especially at the Prague airport, it would be feasible to optimize and reduce delays by rearranging the timetable, since a considerably large number of base aircraft is available there. The table of column relative frequencies clearly shows that differences in the length of delay in individual months are not significant. The only conclusion is that longer delays over one hour occur least frequently in August, while for shorter delays it is the other way round; see Tab. 9. The correspondence map indicates that delays under 1.5 hours are generally less frequent; in the graph, the points representing these values lie apart from the others; see Fig. 7.
The column relative frequencies tell us that short delays under 30 minutes take place most often at night and least often in the evenings. Clearly, there are longer idle times between individual flights during the night and therefore there is enough time for maintenance, and delays get shorter. At the same time, the capacity of the airspace is not limited at night, as the total number of flights is lower. Conversely, longer delays under 2 hours are least frequent at night. Generally, we can say that during the day the differences in delays are minimal; see Tab. 10.
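The column relative frequencies referred to in this section express each cell of a contingency table as a percentage of its column total, so that every column sums to 100%. A minimal sketch in plain Python follows; the function name and the counts are hypothetical and serve only to show the calculation.

```python
def column_relative_frequencies(table):
    """Express each cell as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[100.0 * x / total for x, total in zip(row, col_totals)]
            for row in table]


# Hypothetical counts (not from the study):
# rows = delay-length categories, columns = periods of the day.
counts = [[40, 10],
          [60, 30]]
percentages = column_relative_frequencies(counts)
```

Reading such a table column by column makes it possible to compare, for example, the share of short delays at night with their share in the evening, regardless of how many flights depart in each period.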
The correspondence map shows that in the evenings long delays over 1.5 hours are frequent. Short delays under 30 minutes occur most often at night-time and quite frequently also in the afternoon, see Fig. 8. One of the possible causes is the fact that most airplanes take off from their home airports in the morning and thus there is zero delay propagation.
During the night, delays occur most often at Polish airports (Warsaw, Katowice); at night the traffic intensity is higher, and the logistics of spare part distribution is problematic due to the insufficient technical base mentioned above. Night delays occur least frequently at the Brno, Budapest, and Ostrava airports. At Polish airports, delays are least frequent in the mornings and afternoons. The worst situation in the afternoon is in Ostrava, and in the morning in Brno. In Ostrava, there is only one base aircraft available and delays tend to propagate in the afternoon. On the other hand, the best situation in Brno and Ostrava is in the evenings; see Tab. 11. Especially at these airports, the traffic intensity decreases in the evenings, in Ostrava also at night. At the same time, the idle times between departures are longer and therefore the delay optimization gets easier. The correspondence map brings similar results, i.e. delays occur most commonly at night at Polish airports, in the evenings in Bratislava, and during the day at the Czech airports in Prague, Brno, and Ostrava. In Budapest, the occurrence of delays is spread relatively evenly across the day; therefore, in the graph, Budapest is approximately equally distant from the individual times of day; see Fig. 9.
Delays caused by operational reasons of the airline prevail significantly at the airport in Prague and are infrequent at other airports. Note that a frequent reason for delays in Prague is waiting for transit passengers, because a high number of flights operated by this airline company connect to a previous flight there. Delays related to passengers and their luggage and delays caused during aircraft handling by suppliers are generally very rare at all airports. Delays caused by technical maintenance or aircraft defect occur most often at the Polish airports in Warsaw and Katowice. This may be caused by an insufficient service base at these airports. The best situation in this regard is at the airports in Brno and Ostrava; they have good technical support and high-quality logistics of spare components. Delays caused by air traffic control occur more often at the airports in Warsaw and Budapest, where the air traffic intensity is generally high. Delays caused by airport restrictions are most frequent
The table of column relative frequencies shows that delays caused by operational reasons of the airline prevail in the month of September. It also shows that delays associated with passengers and their baggage and delays caused by supplier companies during aircraft handling are very rare during the reference period. Delays caused by technical maintenance or aircraft defect are most frequent in July. As July is usually considered to be the peak season, more problems occur, and more frequent maintenance is necessary. Delays caused by operational control and crew duty norms do not differ significantly during the period under review and fluctuate around 7%. Delays caused by air traffic control do not change significantly either and reach approx. 9%. Problems caused by propagation of delays occur most often in July; see Tab. 13. In July, the delay probability is slightly higher due to overloaded airports.
The correspondence map shows that propagation of delays is situated in the middle of the graph and is approximately equally distant from all time periods. This is the most frequent delay reason, and it does not change significantly during the period under review. The results are similar to those in the contingency table; it is therefore evident that the delay caused by operational reasons of the airline prevails in the month of September. Delays caused by technical maintenance or aircraft defect are most frequent in July; see Fig. 10.
In Prague, delays occur least often in July. In Bratislava, delays occur most often in July. Generally, we can say that the differences in delays at individual airports do not change significantly during the reference period; see Tab. 14. The graph indicates that at the airports in Ostrava and Bratislava delays often arise in July, when the air traffic intensity reaches its peak. Conversely, in Prague the delays are least frequent in July; see Fig. 11.
Column relative frequencies show that delays caused by operational reasons prevail during the day and occur very rarely at night. In the daytime, problems tend to occur e.g. during aircraft transfers in the ramp area. At night there are generally fewer flights and therefore also fewer problems, e.g. with transiting passengers. It is furthermore evident that delays due to passengers and their baggage and delays caused during aircraft handling by suppliers are very rare during the day. Delays caused by technical maintenance or aircraft defect and delays caused by operational requirements and crew duty norms are most common at night. Delays caused by air traffic control most often happen at night (00:01-06:00 am) and also in the evening (06:01-12:00 pm). Delays caused by airport restrictions are most frequent at night. This can be attributed to the lower number of operational staff at the airports at night. Problems caused by previous flight delays occur least often at night and most often in the afternoon. Fewer planes generally fly at night; therefore, this delay cause is rather scarce. Most flights take place in the afternoon hours, which is probably why this cause is the most common then; see Tab. 15. The analysis confirms the dependence of delay length on the time of day: most airplanes take off from their home airports in the morning and become delayed only as a result of the subsequent flights in the course of the day. The correspondence map shows that in the daytime delays most frequently appear due to operational reasons of the airline and due to the propagation of delays; see Fig. 12.
Short delays are least often caused by operational reasons of the airline, technical maintenance or aircraft defects, and delayed previous flights. Other causes are very frequent. Regarding longer delays (under one hour), the least common causes are problems caused by suppliers. On the other hand, delayed previous flights and problems with passengers and their baggage are highly frequent causes. As for the longest delays (more than one hour), operational reasons of the airline, technical maintenance or aircraft defect, and delayed previous flights prevail. Problems with passengers and their baggage, problems caused during aircraft handling by suppliers, and destination airport restrictions represent the least common causes of these delays; see Tab. 16.
The correspondence map supports the results of the contingency table: operational reasons of the airline and technical maintenance or aircraft defects predominate with regard to the longest delays (over one hour). For delays of 00:31-01:00 (31 minutes to one hour), delayed previous flights are frequently the cause; see Fig. 13.

DISCUSSION AND CONCLUSIONS
The principal objective of this paper was to evaluate and assess the delay-caused problems at selected airports in the V4 countries. The cluster analysis paired with internal information from the airline allowed for the selection of the "Base Airports". At first, all flights at the selected airports were taken into consideration, which led to the following conclusions: scheduled flights are delayed approximately as often as charter flights; delays occur most frequently in June; and Boeing 737-800 reported delays more frequently than Airbus A320. The Kruskal-Wallis test allowed for the identification of statistically significant differences between individual categories. Further analysis of selected airports revealed additional interesting facts.
As for the frequency of delays, the airport in Katowice reports the best results, while the highest number of delayed flights occurs at the airport in Budapest. A possible solution to the situation in Budapest could be adding another aircraft to the base. In general, we may conclude that the frequency of delayed flights increases with the size of the city and the airport. Although there are generally fewer delayed flights in Katowice, the delays are often longer than one hour. Our recommendation would be to work on the technical support in Katowice, since long delays arising at this airport are usually caused by technical maintenance or aircraft defects (as demonstrated by the follow-up analysis of individual delay causes). The analysis further revealed advantages at the Brno and Ostrava airports, where the technical support runs smoothly. In Prague and Brno, short delays under one hour occur often. Especially in Prague, the situation is satisfactory thanks to the higher number of available aircraft. Short delays under 30 minutes occur most often at night and least often in the evenings. Fewer flights are operated at night and thus there is more time for aircraft maintenance between individual flights. Conversely, long delays over 1.5 hours are frequent in the evenings due to the high intensity of air traffic. Most aircraft take off from their home airports in the mornings; therefore, it is mostly short delays under 30 minutes that occur at night and in the afternoon. During the day delays tend to propagate, as shown in the analysis of delay causes by time of day; problems caused by delayed previous flights occur least frequently at night and most frequently in the afternoon. An analysis of the propagation of delays to subsequent flights is provided in the article by Campanelli et al. (2014), which focuses on airline systems behaving in a nonlinear way that is difficult to predict. Models for delay prediction in air transport are also introduced in an article by Rebollo and Balakrishnan (2014).
Delays caused by operational reasons of the airline dominate significantly at the Prague airport, whereas they are rare elsewhere. In Prague there are many connecting flights operated by the airline under review and it is often necessary to wait for transiting passengers. Delays caused by air traffic control and airport restrictions are more often reported from the airports in Warsaw, Prague, and Budapest due to significant traffic intensity. Problems caused by delayed previous flights generally occur more often at the airports in Brno, Ostrava, Budapest, and Bratislava, where there is no aircraft available to optimize possible delays.
Delays caused by technical maintenance or aircraft defects are most frequent in July: the number of flights is generally highest in July, so technical problems are encountered more often and more frequent maintenance is necessary. Propagation of delays also occurs most frequently in July. Delays caused by operational reasons of the airline are frequent during the day. There is generally a high number of flights during the day and there are problems e.g. with transiting passengers and aircraft transfers in the ramp area. Delays caused by technical maintenance or aircraft defects and delays caused by operational requirements and crew duty norms are most common at night. Service inspections are usually done at night, as lower flight demand allows time and space for more demanding service operations. If a crew member is absent, it is difficult to find a replacement at night.
Optimization of delays emerging due to aircraft and crew scheduling has been addressed by AhmadBeygi et al. (2008). Delays caused by air traffic control occur most often at night and also in the evening. The reason behind this may be the fact that the times 04:00-06:00 am and 06:00-09:00 pm are the busiest for the airport airspace capacities. Flight optimization options of the air traffic control have been covered in the works of Wu et al. (2016) and Belkoura et al. (2016). Delays caused by airport restrictions appear most frequently at night when airport staff is limited.
Our analysis has proven that delays triggered in association with passengers and their baggage are not a common problem at the airports under review; the articles by Huang et al. (2016) and Abdelghany et al. (2006) deal with the question of how to solve the possible problems in this area. Zámková et al. (2017) proved that the most frequent cause of delay is the propagation of delays, which tends to increase during the day. According to our analysis of home base V4 airports, the longest delays (two hours or more) occur at night and again, the delay propagation is to blame. Furthermore, both charter and scheduled flights apparently have the same percentage of delayed flights, delays occur most frequently in June, and Boeing 737-800 reports delays more frequently than Airbus A320. The longest delays (over 2 hours) were reported from Katowice and Budapest.
All tested dependences proved to be statistically significant (p-value under 0.001). The findings of this research have been consulted with an expert working in the airline company.
The majority of our findings may be generalised and applied to smaller airlines operating at the airports of the Visegrad Group. However, airlines today have completely different concerns, considering the consequences of the ongoing Covid-19 pandemic. Still, it is our belief that everything will be back to normal before long and the travel industry will return to its pre-Covid state. When this happens, airlines will once again strive to eliminate flight delays, and this study may provide some useful insights, helping with the adoption of strategic measures to curb the number and length of their delays.