A Functional FMECA Approach for the Assessment of Critical Infrastructure Resilience
2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)
HERB+: Evolving an Industrial-Strength Privacy-Preserving Machine Learning Framework
2022 IEEE 27th Pacific Rim International Symposium on Dependable Computing (PRDC)
Driving Profile using Evolutionary Computation
2019 IEEE Congress on Evolutionary Computation (CEC)
Road injuries are among the top ten causes of death worldwide. It has been shown that providing feedback to drivers decreases the likelihood of them engaging in dangerous manoeuvres, such as speeding. It also helps reduce the number of life-threatening incidents related to braking. Due to their ubiquity, smartphones are a great resource for assessing driving behaviour. Several mobile applications have been created for this purpose, but there is no concrete evidence that these approaches offer consistent results across distinct platforms (operating systems) and hardware. Providing a model for assessing driver behaviour across distinct devices represents a major challenge, due to the increasing differentiation between platforms and mobile devices’ internal sensors (gyroscope, accelerometer, GPS, and magnetometer). In this study we propose the application of Evolutionary Computation techniques to create models for driving behaviour characterisation over data acquired from mobile devices with distinct sensors. Our experiments show that we are able to evolve models that are robust and can accurately identify the legs of a car journey that have abnormal events. Concretely, we are able to evolve predictive models that can successfully create a profile of a person's driving behaviour.
A holistic data modeling approach for multi-database systems
2021 IEEE International Conference on Big Data (Big Data), 2021
IoT, edge-oriented systems, and the growing ubiquity of access to the Internet have driven the development of the most complex software systems to date. Designing such systems is demanding due to their distributed nature, different technologies, multi-layer, hard-to-meet quality attributes, and the integration of several databases with diverse technologies. This work proposes a data modeling method able to represent holistically these systems’ data structure, data transport, and transformation.
Performance Evaluation and Benchmarking for the Era of Cloud(s), 2020
I would first like to thank my thesis advisors Prof. Bruno Cabral and Prof. Jorge Bernardino for their patience and perseverance. Also, I want to thank all my friends for cheering for me. Finally, I want to thank my parents and sister for having shaped my values and beliefs towards the person I am today.
Proceedings of the 8th International Joint Conference on Computational Intelligence, 2016
DNA discovery has put humans one step closer to deciphering their own structure stored as biological data. Such data could provide us with a huge amount of information, necessary for studying ourselves and learning all the variants that predetermine one's characteristics. Although, these days, we are able to extract DNA from our cells and transform it into sequences, there is still a long road ahead since DNA has not been easy to process or even extract in one go. Over the past years, bioinformatics has been evolving steadily, constantly aiding biologists in their attempts to "break" the code. In this paper, we present some of the most relevant algorithms and principles applied to the analysis of our DNA. We attempt to provide a basic genome overview, but the focus of our study is on assembly, one of the main phases of DNA analysis.
Simulators allow for the simulation of real-world environments that would otherwise be financially costly and technically difficult to implement. Thus, a simulation environment facilitates the implementation and development of use cases, making such development faster and more cost-effective, and it can be used in several scenarios. There are some works about simulation environments in Edge Computing (EC), but there is a lack of studies that establish the validity of these simulators. This paper compares the execution of the EdgeBench benchmark in a real-world environment and in a simulation environment using FogComputingSim, an EC simulator. Overall, the simulated environment was 0.2% faster than the real world, allowing us to state that we can trust EC simulations, and to conclude that it is possible to implement and validate proofs of concept with FogComputingSim.
Proceedings of the 2nd International Conference on Complexity, Future Information Systems and Risk, 2017
Safety-critical systems have to continuously manage risks in order to handle hazardous situations and still be able to fulfil their purpose. Because they are composed of a variety of software and hardware components, each part of these systems, alone and as a whole, must exhibit a required set of characteristics to ensure correct system functioning. Complex Event Processing (CEP) systems have been used in a variety of applications and, while they focus on fast data gathering and processing as well as on providing intelligence to their users, there is incomplete information about how adequate they are for integration into safety-critical systems. In this paper we investigate whether mainstream off-the-shelf CEP systems are suitable for safety-critical applications. We describe the use of complex event processing engines in safety-critical systems and how some authors enhance them to better meet critical system requirements. We demonstrate that, although dependability is well handled in most CEP systems, the same cannot be assumed about security and safety attributes.
There are billions of lines of sequential code inside today's software which do not benefit from the parallelism available in modern multicore architectures. Automatically parallelizing sequential code, to promote an efficient use of the available parallelism, has been a research goal for some time now. This work proposes a new approach for achieving this goal. We created a new parallelizing compiler that analyses the read and write instructions, and control-flow modifications in programs to identify a set of dependencies between the instructions in the program. Afterwards, the compiler, based on the generated dependency graph, rewrites and organizes the program in a task-oriented structure. Parallel tasks are composed of instructions that cannot be executed in parallel. A work-stealing-based parallel runtime is responsible for scheduling and managing the granularity of the generated tasks. Furthermore, a compile-time granularity control mechanism also avoids creating unnece...
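The read/write dependency analysis mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' compiler: the `Instr` record, the `dependencies` function, and the three-instruction program are hypothetical names chosen for illustration, and control-flow dependencies are ignored.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction: a name plus the variables it reads and writes."""
    name: str
    reads: frozenset = field(default_factory=frozenset)
    writes: frozenset = field(default_factory=frozenset)


def dependencies(program):
    """Return the (earlier, later) pairs whose relative order must be preserved."""
    deps = set()
    for j, later in enumerate(program):
        for earlier in program[:j]:
            raw = earlier.writes & later.reads    # read-after-write
            war = earlier.reads & later.writes    # write-after-read
            waw = earlier.writes & later.writes   # write-after-write
            if raw or war or waw:
                deps.add((earlier.name, later.name))
    return deps


# Illustrative program: i1 and i2 touch disjoint variables, i3 needs both results.
program = [
    Instr("i1", reads=frozenset({"a"}), writes=frozenset({"x"})),
    Instr("i2", reads=frozenset({"b"}), writes=frozenset({"y"})),
    Instr("i3", reads=frozenset({"x", "y"}), writes=frozenset({"z"})),
]
deps = dependencies(program)
```

In this toy program no edge links `i1` and `i2`, so under the stated assumptions they could be placed in separate tasks, while `i3` has to wait for both.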
Today, data-intensive systems such as e-business, e-procurement, e-government, and e-commerce systems present a huge amount of available data. One of the main issues of query processing is how to process queries efficiently. In many cases, it is impossible or too expensive for users to get exact answers in a short query response time. Approximate query processing (AQP) is an alternative that returns approximate answers and is increasingly used, as millions of data items are processed daily in a database. In this paper, we evaluate two of the latest AQP systems with the best results in the literature: VerdictDB and XDB. We test these systems according to the query response time and accuracy of the results returned, using queries of the TPC-H benchmark with different sizes. VerdictDB and XDB are good tools for large volumes of data. The experiments demonstrate that VerdictDB can be 76x faster than MySQL. However, with the same query response time, XDB returns more accurate results. Keywords: approximate query processing, error rate, query response time.
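The trade-off between response time and accuracy described above can be sketched with plain uniform sampling in Python. This is a generic illustration of approximate aggregation, not a description of how VerdictDB or XDB work internally; `approximate_sum`, `relative_error`, and the synthetic `prices` column are hypothetical.

```python
import random


def approximate_sum(values, sample_fraction, seed=0):
    """Estimate sum(values) from a uniform random sample, scaled up by N/n."""
    rng = random.Random(seed)
    n = max(1, int(len(values) * sample_fraction))
    sample = rng.sample(values, n)
    return sum(sample) * len(values) / n


def relative_error(exact, estimate):
    """Error rate of the approximate answer relative to the exact one."""
    return abs(estimate - exact) / abs(exact)


# Synthetic column standing in for a large table (assumption for illustration).
prices = [float(i % 100 + 1) for i in range(100_000)]
exact = sum(prices)
estimate = approximate_sum(prices, sample_fraction=0.01)  # reads 1% of the rows
error = relative_error(exact, estimate)
```

Reading only a fraction of the rows is what shortens the response time; the price is a non-zero `error`, which shrinks as `sample_fraction` grows.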
Conceptual modeling describes the physical or social aspects of the world abstractly, encompassing the interpretation of data production, gathering, visualization, and analysis. The quality of the data analysis system will limit the excellence of any decision-making process. Thus, accurately specifying the physical data model is essential. The primary goal of our work is to compare tools that can create this physical model. We recognize several types of data models, but we only include the relational data model. We evaluate free and commercial data modeling tools. However, it is challenging to decide how to compare them and which elements are crucial. We propose a new approach for software tools' evaluation based on the Business Readiness Rating (BRR) model and the OSSpal evaluation methodology. In this work, we show that this new methodology can be tailored to the needs of each individual developer or team, thus providing proper and meaningful results. Also, by applying this hybrid approach to the evaluation of data modeling tools, we show it can robustly handle the bias from less relevant evaluation categories.
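One way to picture how an evaluation can be tailored to a team and made less sensitive to less relevant categories is a weighted score. The Python sketch below assumes a simple scheme in which each category receives a score from 1 to 5 and a weight, with weights summing to 1; the category names, weights, and scores are hypothetical and are not taken from the paper, BRR, or OSSpal.

```python
def weighted_rating(scores, weights):
    """Combine per-category scores (1 to 5) using weights that sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(scores[category] * weight for category, weight in weights.items())


# Hypothetical team preferences: usability matters most, support least.
weights = {"usability": 0.5, "documentation": 0.3, "support": 0.2}
scores = {"usability": 4, "documentation": 3, "support": 5}
rating = weighted_rating(scores, weights)
```

Because a low-weight category contributes little to `rating`, an extreme score there shifts the overall result only slightly.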
Language-Based Expression of Reliability and Parallelism for Low-Power Computing
IEEE Transactions on Sustainable Computing, 2018
Improving the energy-efficiency of computing systems while ensuring reliability is a challenge in all domains, ranging from low-power embedded devices to large-scale servers. In this context, a key issue is that many techniques aiming to reduce power consumption negatively affect reliability, while fault tolerance techniques require computation or state redundancy that increases power consumption, thereby leading to systematic tradeoffs. Managing these tradeoffs requires a combination of techniques involving both the hardware and the software, as it is impractical to focus on a single component or level of the system to reach adequate power consumption and reliability. In this paper, we adopt a language-based approach to express reliability and parallelism, in which programs remain adaptable after compilation and may be executed with different strategies concerning reliability and energy consumption. We implement the proposed programming model, which is named MISO, and perform an experimental analysis aiming to improve the reliability of programs, through fault injection experiments conducted at compile-time, as well as an experimental measurement of power consumption. The results obtained indicate that it is feasible to write programs that remain adaptable after compilation in order to improve the ability to balance reliability, power, and performance.
Parallel programs have the potential of executing several times faster than sequential programs. However, in order to achieve their potential, several aspects of the execution have to be parameterized, such as the number of threads, task granularity, etc. This work studies the task granularity of regular and irregular parallel programs on symmetrical multicore machines. Task granularity is how many parallel tasks are created to perform a certain computation. If the granularity is too coarse, there might not be enough parallelism to occupy all processors. But if granularity is too fine, a large percentage of the execution time may be spent context switching between tasks, and not performing useful work.
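The granularity trade-off described above can be made concrete with a small Python sketch. The helper names `make_tasks` and `parallel_sum`, the worker count, and the chunk sizes are assumptions for illustration. In standard CPython builds, threads do not speed up a CPU-bound sum, so the sketch only shows how the chunk size determines the number of tasks, not a performance result.

```python
from concurrent.futures import ThreadPoolExecutor


def make_tasks(data, chunk_size):
    """Split data into tasks of chunk_size items; smaller chunks mean finer granularity."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def parallel_sum(data, chunk_size, workers=4):
    """Sum data by summing each task in a thread pool, then combining the partial sums."""
    tasks = make_tasks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, tasks))
    return sum(partials), len(tasks)


data = list(range(1_000))
# Coarse: fewer tasks than workers, so some workers stay idle.
coarse_total, coarse_tasks = parallel_sum(data, chunk_size=500)
# Fine: one task per element, so scheduling work outweighs useful work.
fine_total, fine_tasks = parallel_sum(data, chunk_size=1)
```

Both settings return the same total; only the number of tasks handed to the scheduler changes.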
The emergence of exception handling mechanisms in modern programming languages made available a different way of communicating errors between procedures. For years, programmers relied on correct documentation of the error codes returned by procedures to handle erroneous situations properly. Now, they have to focus on the documentation of exceptions for the same effect. But to what extent can exception documentation be trusted? Moreover, is there enough documentation for exceptions? And in which way do these questions relate to the checked vs. unchecked exceptions discussion? For a given set of Microsoft .NET applications, code and documentation were thoroughly parsed and compared. This showed that exception documentation tends to be scarce and, when it exists, of poor quality. In particular, it showed that 90% of exceptions are undocumented. Furthermore, programmers were demonstrated to be keener to document exceptions they explicitly throw while typically leaving exceptions result...
Most modern programming languages rely on exceptions for dealing with errors. Although exception handling was a significant improvement over other mechanisms like checking return codes, it's far from perfect. In fact, it can be argued that this mechanism is seriously flawed. In this paper we argue that exception handling should be automatically done at the runtime/operating system level. The motivation is similar to the one that led to garbage collection: memory management was a tedious and error-prone process, thus virtual machines included support for taking care of it. We believe that many exceptions can be automatically dealt with, and recovered from, as long as appropriate mechanisms exist in the runtime environment. We believe that this approach may dramatically influence the way programming languages are designed and significantly contribute to more robust code being developed with much less programming effort.
Designing a Neural Network from Scratch for Big Data Powered by Multi-node GPUs
Lately, Machine Learning has taken on a crucial role in society across different vertical sectors. For complex, high-dimensional problems, Deep Learning has become an efficient solution for learning in the context of supervised learning. Deep Learning [1] consists of using Artificial Neural Networks (ANN or NN) with several hidden layers, typically also with a large number of nodes in each layer.
When Two Are Better Than One: Synthesizing Heavily Unbalanced Data
IEEE Access, 2021
Nowadays, data is king and, if treated and used properly, it promises to give organizations a competitive edge over rivals by enabling them to develop and design Intelligent Systems to improve their services. However, they need to fully comply with not only ethical but also regulatory obligations, where, e.g., privacy (strictly) needs to be respected when using or sharing data, thus protecting both the interests of users and organizations. Fraud Detection systems are examples of such systems where Machine Learning algorithms leverage information to classify financial transactions as legitimate or illicit. The data used to create these solutions is usually highly structured and contains categorical and continuous features characterised by complex distributions. One of the main challenges of fraud detection is the scarcity of fraudulent instances, which results in highly unbalanced datasets. Additionally, privacy is crucial, and it is usually forbidden, or not possible, to share the data of organizations and individuals for creating or improving models. In this paper we propose a framework for private data sharing based on synthetic data generation using Generative Adversarial Networks (GAN) that learns the specificities of financial transactions data and generates fictitious data that keeps the utility of the original datasets. Our proposal, called Duo-GAN, uses two GAN generators to handle the data imbalance problem, one generator for fraudulent instances and the other for legitimate instances. With this approach, we observed, at most, a 5% disparity in F1 scores between classifiers trained and tested with actual data and the ones trained with synthetic data and tested with actual data.
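The F1 disparity reported above compares two classifiers on the same real test set. The Python sketch below shows how such a comparison can be computed; the labels and predictions are made-up toy values, not results from the paper, and `f1_score` here is a small local helper rather than a library call.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class, computed as 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Toy real test labels (1 = fraudulent, 0 = legitimate); hypothetical values.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# Hypothetical predictions of a classifier trained on real data.
pred_trained_on_real = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
# Hypothetical predictions of a classifier trained on synthetic data.
pred_trained_on_synthetic = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

f1_real = f1_score(y_true, pred_trained_on_real)
f1_synthetic = f1_score(y_true, pred_trained_on_synthetic)
disparity = abs(f1_real - f1_synthetic)
```

A small `disparity` suggests that the synthetic data preserved most of the utility of the original data for this classification task.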
2019 15th International Conference on Distributed Computing in Sensor Systems (DCOSS), 2019
This study examines the antecedents of the risks perceived by tourists coming to Egypt and how they affect tourists' behavior in searching for information before deciding to travel to Egypt. Research related to antecedents of perceived risks in the tourism literature was reviewed. This research focuses on the knowledge aspects of tourists, because they are related to the information search process; these are represented by previous experience in international travel, previous visits to Egypt as a destination, personal knowledge, and objective knowledge. The effect of perceived risks as a mediator between these antecedents and information search was examined. The data were collected using survey lists from 520 tourists who visited Egypt at least once, during their stay in Egypt. Using SEM, the results show that while objective knowledge did not significantly reduce or increase the risks associated with travel to Egypt, personal knowledge had the strongest effect on perceived risks. The results also indicate that while different dimensions of perceived risks affect the use of different sources of information, previous knowledge also plays a role alongside perceived risks in identifying the sources of information used.
Papers by Bruno Cabral