A Flexible Framework for Real-Time Thermal-Aware Schedulers using Timed Continuous Petri Nets

Desirena Lopez, Gaddiel; Rubio Anguiano, Laura Elena; Ramírez Treviño, Antonio; Briz Velasco, José Luis

doi:10.13053/cys-23-2-3204

Computación y Sistemas

On-line version ISSN 2007-9737, print version ISSN 1405-5546

Comp. y Sist. vol. 23 no. 2, Mexico City, Apr./Jun. 2019. Epub 10-Mar-2021

https://doi.org/10.13053/cys-23-2-3204

Articles of the thematic issue

Gaddiel Desirena Lopez¹*

Laura Elena Rubio Anguiano¹

Antonio Ramírez Treviño¹

José Luis Briz Velasco²

¹ CINVESTAV-IPN Unidad Guadalajara, Mexico. gdesirena@gdl.cinvestav.mx, lerubio@gdl.cinvestav.mx, art@gdl.cinvestav.mx

² Universidad de Zaragoza, Spain. briz@unizar.es

Abstract:

This work presents TCPN-ThermalSim, a software tool for testing Real-Time Thermal-Aware Schedulers¹. The framework consists of four main modules. The first helps the user define the problem: a task set with periods, deadlines and worst-case execution times in CPU cycles, along with the CPU characteristics, temperature and energy consumption. The second module is the Kernel simulation, which builds a global simulation model according to the configuration module. In the third module, the user selects the scheduling algorithm. Finally, the last module runs the simulation and presents the results. The framework has two modes: manual and automatic. In manual mode the simulator uses the task set data provided in the first module. In automatic mode the task set is generated by parameterizing the integrated UUniFast algorithm.

Keywords: Scheduling; simulator; Petri nets

1 Introduction

Many modern embedded real-time (RT) systems can benefit from today's powerful multicore Systems-on-Chip (MPSoCs). However, RT task scheduling on multiprocessors is far more challenging than traditional RT scheduling on uniprocessors. Although practical solutions exist for specific cases, handling thermal restrictions, optimizing energy and dealing with resource sharing remain open questions [6]. In recent contributions we have explored the design of Thermal-Aware RT Schedulers, leveraging control techniques and combining fluid and combinatorial schedulers to make the most of both approaches [15].

On the one hand, fluid schedulers avoid the NP-completeness of a purely combinatorial approach and allow the use of continuous controllers, which make it easier to reject disturbances or to adapt to small parameter variations. On the other hand, the combinatorial approach better matches the nature of the problem, discretizing the fluid schedules to avoid excessive migrations and context switches.

Testing this kind of scheduler, or experimenting with variations of it, by implementing it on a real system, even on a specific evaluation platform like Litmus-RT [5], can be overkill when we are only performing a preliminary exploration of the design space. For this reason, we present a novel flexible simulation framework for real-time thermal-aware schedulers: TCPN-ThermalSim. It is composed of four main modules, which rely on several submodules.

The first module, referred to as the configuration module, allows the user to enter the task set according to the three-parameter task model [7], or to use the UUniFast submodule to generate the task set automatically. It also allows the user to describe the platform that executes the task set: the number of CPUs, the CPU frequencies and the thermal parameters. The second module is the Kernel simulation, which works over three submodules; each submodule generates a Timed Continuous Petri Net (TCPN) model, and these models are merged to build a global simulation model. The third module focuses on the scheduler definition and allows new scheduling algorithms to be added for analysis. The last module corresponds to the execution of the simulation and presents the results as graphs or as data in the workspace.

The framework includes out of the box the general-purpose task, CPU usage and thermal models reported in [22]. We have included three schedulers: a global EDF scheduler from [1], and two of our own, an RT fluid scheduler [15] and a thermal-aware RT scheduler [16]. Users can also define their own scheduler. Sec. 2 provides some background on multiprocessor RT scheduling and Timed Continuous Petri Nets (TCPN), since this formal tool is used to model the task set, CPU usage and thermal aspects of task execution. Sec. 3 describes the simulation framework architecture and Sec. 4 the user interface. The schedulers included in the framework are described in Sec. 5. Sec. 6 shows a usage example with results, and Sec. 7 presents some conclusions and future work.

2 Background

There are three well-known avenues to leverage multiprocessors for RT scheduling. Partitioned approaches statically allocate tasks to processors. This results in a simple schedulability analysis, since results from RT uniprocessor scheduling can be reused. The downside is that this involves an NP-hard bin-packing problem and, as a consequence, assuring schedulability imposes a maximum CPU utilization bound of 50% or less [21].

Global scheduling gets around the problem by dynamically allocating jobs (the instances of the periodic tasks) to processors, achieving a 100% utilization bound. For instance, Pfair-based algorithms allocate a new task for execution at every time quantum [26], which requires all task parameters to be multiples of that quantum. The main drawback is that this approach triggers an impractical number of preemptions and migrations. For this reason, deadline partitioning takes scheduling decisions only at the set of all deadlines of all tasks in the system, achieving optimality with fewer preemptions than Pfair [18].

A third approach mixes static and dynamic allocation. Clustered scheduling statically allocates tasks to clusters of CPUs, but jobs can migrate within their cluster [8]. Alternatively, semipartitioned scheduling performs a preliminary static allocation of some tasks over the whole set of available CPUs and allows the rest to migrate [19]. Recently, [6] and [9] leveraged this technique to reduce migrations. The simulation framework proposed here is intended as a tool for testing the performance of such schedulers, and of new ones, in early stages. It can stress schedulers under realistic conditions, avoiding the effort of adapting them to actual operating systems, and it helps detect early the consequences of a poor thermal balance.

When a dynamic thermal balance is a requirement, in order to keep the maximum temperature under control, static allocation techniques are far more restrictive than global approaches. This is the reason why we consider global scheduling, with strategies to minimize preemption and migration like deadline partitioning.

Over the years, different real-time simulation tools with different features have been proposed. Cheddar [25] is a real-time scheduler simulator written in Ada; it handles the multiprocessor case and provides implementations of scheduling, partitioning and analysis algorithms, but its interface is not very user friendly. YARTISS [10] evaluates scheduling algorithms by considering overheads or hardware effects, and its design focuses on energy consumption. SimSo [11] is a simulation tool that includes different scheduling policies and takes into account multiple kinds of overheads. All of these tools are a great aid when developing new algorithms, but none of them can include a thermal model. For this reason we developed TCPN-ThermalSim as a framework capable of performing a thermal analysis of scheduling algorithms.

Table 1 Software simulation tools comparison

Framework	Programming language	Custom Schedulers	Energy considerations	Thermal considerations
Cheddar	Ada
SimSo	Python
YARTISS	Java
TCPN ThermalSim	MATLAB

Finally, since the proposed framework models tasks and CPUs using Timed Continuous Petri Nets (TCPN), this section introduces basic definitions concerning Petri nets and continuous Petri nets. The interested reader may also consult [12], [13], [24] for deeper insight into the field.

2.1 Discrete Petri Nets

Definition 2.1 A (discrete) Petri net is the 4-tuple $N = (P, T, Pre, Post)$ where $P$ and $T$ are finite disjoint sets of places and transitions, respectively. $Pre$ and $Post$ are the $|P| \times |T|$ pre- and post-incidence matrices, where $Pre(i, j) > 0$ (resp. $Post(i, j) > 0$) if there is an arc going from $p_i$ to $t_j$ (resp. going from $t_j$ to $p_i$); otherwise $Pre(i, j) = 0$ (resp. $Post(i, j) = 0$).

Definition 2.2 A (discrete) Petri net system is the pair $Q = (N, M)$ where $N$ is a Petri net and $M: P \to \mathbb{N} \cup \{0\}$ is the marking function assigning zero or a natural number to each place. The marking is also represented as a column vector $M$ whose $i$-th element equals $M(p_i)$, the number of tokens residing in $p_i$. $M_0$ denotes the initial marking distribution.

A transition $t \in T$ is said to be enabled at the marking $M \in (\mathbb{N} \cup \{0\})^{|P|}$ iff $M \ge Pre(P, t)$. The occurrence or firing of an enabled transition leads to a new marking distribution $M' \in (\mathbb{N} \cup \{0\})^{|P|}$ that can be computed as $M' = M + C(P, t) = M + C \cdot e_t$, where $C = Post - Pre$ is named the incidence matrix and $e_t$ denotes the $t$-th elementary vector ($e_t(k) = 1$ if $k = t$, otherwise $e_t(k) = 0$).
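As an illustration only, the enabling and firing rules above can be sketched in Python. The two-place net, the matrix layout (rows are places, columns are transitions) and the function names are assumptions made for this sketch; they are not part of TCPN-ThermalSim.

```python
from typing import List

Matrix = List[List[int]]


def is_enabled(pre: Matrix, marking: List[int], t: int) -> bool:
    """Transition t is enabled iff M >= Pre(P, t) componentwise."""
    return all(marking[p] >= pre[p][t] for p in range(len(marking)))


def fire(pre: Matrix, post: Matrix, marking: List[int], t: int) -> List[int]:
    """Return M' = M + C(P, t), with C = Post - Pre."""
    if not is_enabled(pre, marking, t):
        raise ValueError(f"transition {t} is not enabled at {marking}")
    return [marking[p] + post[p][t] - pre[p][t] for p in range(len(marking))]


# Hypothetical net: t0 moves one token from p0 to p1, t1 moves it back.
PRE = [[1, 0],   # p0 is an input place of t0
       [0, 1]]   # p1 is an input place of t1
POST = [[0, 1],  # t1 produces into p0
        [1, 0]]  # t0 produces into p1

M0 = [1, 0]
M1 = fire(PRE, POST, M0, 0)
```

In this toy net only t0 is enabled at M0; after firing it, only t1 is enabled.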

2.2 Continuous and Timed Continuous Petri Nets

Definition 2.3 A continuous Petri net (ContPN) is a pair $ContPN = (N, m_0)$ where $N = (P, T, Pre, Post)$ is a Petri net (PN) and $m_0 \in \{\mathbb{R}^+ \cup 0\}^{|P|}$ is the initial marking.

The evolution rule is different from the discrete PN case. In continuous PNs the firing amount is not restricted to be an integer. A transition $t_i$ in a ContPN is enabled at $m$ iff $\forall p_j \in {}^{\bullet}t_i$, $m(p_j) > 0$; and its enabling degree is defined as $enab(t_i, m) = \min_{p_j \in {}^{\bullet}t_i} \frac{m(p_j)}{Pre(p_j, t_i)}$. The firing of $t_i$ in a certain positive amount $\alpha \le enab(t_i, m)$ leads to a new marking $m' = m + \alpha C(P, t_i)$, where $C = Post - Pre$ is computed as in the discrete case.

If $m$ is reachable from $m_0$ by firing the finite sequence $\sigma$ of enabled transitions, then $m = m_0 + C\vec{\sigma}$ is named the fundamental equation, where $\vec{\sigma} \in \{\mathbb{R}^+ \cup 0\}^{|T|}$ is the firing count vector, i.e., $\vec{\sigma}(t_j)$ is the cumulative amount of firings of $t_j$ in the sequence $\sigma$.

Definition 2.4 A timed continuous Petri net (TCPN) is a time-driven continuous-state system described by the tuple $(N, \lambda, m_0)$ where $(N, m_0)$ is a continuous PN and the vector $\lambda \in \{\mathbb{R}^+ \cup 0\}^{|T|}$ represents the transition rates that determine the temporal evolution of the system.

Transitions fire at a certain speed, which is generally a function of the transition rates and the current marking. This function depends on the semantics associated with the transitions. Under the infinite server semantics [23] the flow through a transition $t_i$ (the transition firing speed) is defined as the product of its rate, $\lambda_i$, and $enab(t_i, m)$, the instantaneous enabling of the transition, i.e., $f_i(m) = \lambda_i\, enab(t_i, m) = \lambda_i \min_{p_j \in {}^{\bullet}t_i} \frac{m(p_j)}{Pre(p_j, t_i)}$ (throughout the rest of this paper, for the sake of simplicity, the flow through a transition $t_i$ is denoted as $f_i$).

The firing rate matrix is denoted by $\Lambda = diag(\lambda_1, \ldots, \lambda_{|T|})$. For the flow to be well defined, every continuous transition must have at least one input place, hence in the following we assume $\forall t \in T$, $|{}^{\bullet}t| \ge 1$. The "min" in the above definition leads to the concept of configuration.

A configuration of a TCPN at $m$ is a set of arcs $(p_i, t_j)$ such that $p_i$ provides the minimum ratio $m(p)/Pre(p, t_j)$ among the places $p \in {}^{\bullet}t_j$ at the given marking $m$. We say that $p_i$ constrains $t_j$ for each arc $(p_i, t_j)$ in the configuration. A configuration matrix is defined for each configuration as follows:

$$\Pi(m)[j, i] = \begin{cases} \dfrac{1}{Pre(i, j)} & \text{if } p_i \text{ is constraining } t_j, \\ 0 & \text{otherwise.} \end{cases} \tag{1}$$

The flow through the transitions can be written in vector form as $f(m) = \Lambda \Pi(m) m$. The dynamical behaviour of a TCPN system is described by its fundamental equation:

$$\dot{m} = C \Lambda \Pi(m) m. \tag{2}$$
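A minimal sketch of how Eq. (2) can be integrated numerically under infinite server semantics is given below. The forward-Euler scheme, the step size, the two-place cycle and all function names are assumptions chosen for illustration; the framework's own solver may differ.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def enabling_degree(pre: Matrix, m: Sequence[float], t: int) -> float:
    """enab(t, m): minimum of m(p) / Pre(p, t) over the input places of t."""
    ratios = [m[p] / pre[p][t] for p in range(len(m)) if pre[p][t] > 0]
    if not ratios:
        raise ValueError("transition without input place: flow is undefined")
    return min(ratios)


def flows(pre: Matrix, rates: Sequence[float], m: Sequence[float]) -> List[float]:
    """Infinite-server flow f_i(m) = lambda_i * enab(t_i, m)."""
    return [rates[t] * enabling_degree(pre, m, t) for t in range(len(rates))]


def euler_step(pre: Matrix, post: Matrix, rates: Sequence[float],
               m: Sequence[float], dt: float) -> List[float]:
    """One forward-Euler step of dm/dt = C f(m), with C = Post - Pre."""
    f = flows(pre, rates, m)
    return [
        m[p] + dt * sum((post[p][t] - pre[p][t]) * f[t] for t in range(len(f)))
        for p in range(len(m))
    ]


# Hypothetical cycle: t0 moves fluid from p0 to p1 at rate 1,
# t1 moves it back at rate 2.
PRE = [[1, 0], [0, 1]]
POST = [[0, 1], [1, 0]]
RATES = [1.0, 2.0]

m = [1.0, 0.0]
for _ in range(2000):  # 2000 steps of dt = 0.01, i.e. 20 time units
    m = euler_step(PRE, POST, RATES, m, 0.01)
```

In this example the total marking is conserved and the marking approaches the point where both flows balance, m(p0) = 2/3 and m(p1) = 1/3.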

In order to apply a control action to (2), a term $u_i$ such that $0 \le u_i \le f_i(m)$ is subtracted from the flow of every transition $t_i$ to indicate that its flow can be reduced. Thus the controlled flow of transition $t_i$ becomes $w_i = f_i - u_i$. Then, the forced state equation is:

$$\dot{m} = C[f(m) - u] = Cw, \qquad 0 \le u_i \le f_i(m). \tag{3}$$
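The bound on the control action in Eq. (3) can be made explicit with a short Python sketch. Clamping an out-of-range request into the admissible interval is a design choice of this example, not something prescribed by the paper, and the function name is hypothetical.

```python
from typing import List, Sequence


def controlled_flow(f: Sequence[float], u: Sequence[float]) -> List[float]:
    """Return w with w_i = f_i - u_i, after clamping u_i into [0, f_i]."""
    w = []
    for f_i, u_i in zip(f, u):
        u_i = min(max(u_i, 0.0), f_i)  # enforce 0 <= u_i <= f_i(m)
        w.append(f_i - u_i)
    return w


# The second request (3.0) exceeds the available flow (1.0) and is clamped.
W = controlled_flow([2.0, 1.0], [0.5, 3.0])
```

With this convention the controlled flow can only slow a transition down, never reverse it or push it beyond its uncontrolled speed.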

2.3 System Definition

The task model accepted by this framework follows the three-parameter task model [7], extended to include energy consumption parameters. Each periodic real-time task $\tau_i$ is described by a quadruplet $\tau_i : (cc_i, \omega_i, d_i, e_i)$, where $cc_i$ is the worst-case execution time in cycles, $\omega_i$ is the period, $d_i$ is the deadline, and $e_i$ is the task energy consumption.

The set of periodic tasks $\mathcal{T} = \{\tau_1, \ldots, \tau_n\}$ is executed on a set of identical processors $\mathcal{P} = \{CPU_1, \ldots, CPU_m\}$ with a homogeneous clock frequency $F \in \mathcal{F} = \{F_1, \ldots, F_{max}\}$. The hyper-period is defined as the least common multiple of the periods of the $n$ periodic tasks, $H = lcm(\omega_1, \omega_2, \ldots, \omega_n)$.

A task $\tau_i$ executed on a processor at frequency $F$ requires $c_i = \frac{cc_i}{F}$ units of processor time in every interval of length $\omega_i$. The system utilization is defined as the fraction of time during which the processor is busy running the task set, i.e., $U = \sum_{i=1}^{n} \frac{c_i}{\omega_i}$. Herein we consider that the execution of real-time tasks in the system is preemptable.
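The hyper-period and utilization definitions can be checked with a few lines of Python. The task values below are invented for illustration, periods are assumed to be integers, and exact fractions are used only to avoid rounding in the example.

```python
from fractions import Fraction
from functools import reduce
from math import gcd
from typing import Sequence, Union

Number = Union[int, Fraction]


def hyperperiod(periods: Sequence[int]) -> int:
    """H = lcm(w_1, ..., w_n) for integer periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


def utilization(cc: Sequence[int], periods: Sequence[int], freq: Number) -> Fraction:
    """U = sum_i c_i / w_i, with c_i = cc_i / F."""
    f = Fraction(freq)
    return sum((Fraction(c) / f / w for c, w in zip(cc, periods)), Fraction(0))


# Hypothetical task set: WCETs in cycles and periods in time units.
CC = [2, 2, 3]
PERIODS = [4, 8, 12]

H = hyperperiod(PERIODS)                          # 24
U_FULL = utilization(CC, PERIODS, 1)              # 1 at the highest frequency
U_HALF = utilization(CC, PERIODS, Fraction(1, 2)) # 2 at half frequency
```

Halving the normalized frequency doubles every $c_i$ and therefore the utilization, which is why the frequency scale matters when checking whether a task set fits on $m$ CPUs.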

3 Framework Architecture

The simulation framework has been programmed in MATLAB R2018a [20]. It is distributed as open source software, as-is [14]. Its modular design provides flexibility to test a wide variety of schedulers and platforms. It includes a signal routing interface that allows switching among different user-defined scheduling algorithms.

The framework makes it easier to evaluate a large number of different scenarios, where the platform (hardware), the set of tasks and the schedulers can be defined by the user through a Graphical User Interface (GUI).

Fig. 1 shows the main modules of the framework: Configuration, UUniFast, Kernel Simulation, Scheduler and Results. First, the user enters the set of tasks, the platform and the scheduler in the configuration module. Once the configuration stage is complete, the simulation is executed and the results are then presented to the user. The following subsections describe these modules.

Fig. 1 Framework Architecture

Fig. 2 Kernel of the simulator

3.1 Configuration Module

This module collects all the information required by the framework. It is organized in four sections: a) Task definition, b) CPU definition, c) Thermal definition, and d) Scheduler definition. The order in which the information is entered is irrelevant. The user can resort to default values or turn off some sections, such as the thermal definition.

a) The Task definition section offers two ways to enter the information. One way is to manually enter the number of tasks along with their parameters. The other is to let the UUniFast algorithm [4] automatically generate a task set with the desired characteristics.
b) The CPU definition section requires two parameters: the number of CPUs and their frequency scale. The frequency scale is a set of normalized frequencies at which the platform can operate, where 1 indicates the highest frequency. The framework assumes homogeneous CPUs, a restriction that will be relaxed in future releases.
c) The Thermal definition section requires the Printed Circuit Board (PCB) and CPU dimensions, as in Fig. 3. It also requires the isotropic thermal properties (density, specific heat capacity and thermal conductivity coefficient), as well as the ambient temperature and the maximum operating temperature.
d) The Scheduler definition section is generic. The user can either select one from a set of pre-programmed schedulers or define a custom scheduler.

Fig. 3 PCB reference for thermal definition

To design a custom scheduler, the user should consider signals such as CPU temperature, system utilization and energy consumption (or a subset of them). At every time step, this section generates a matrix of size (number of tasks) × (number of CPUs), whose $(i, j)$-th entry represents the allocation of the $i$-th task to the $j$-th CPU.

3.2 UUniFast Submodule

The user can opt for the UUniFast algorithm [4] to generate the task set in the configuration stage, indicating the number of tasks to be generated, the system utilization $U$ and a range for the task periods. The output is a feasible real-time task set with random task periods, WCETs, deadlines and consumed energy. UUniFast generates one set of tasks at a time, which makes it possible to stress the scheduler under analysis with different task sets.
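For readers unfamiliar with UUniFast, a minimal Python sketch of the utilization-splitting step is shown below. It follows the commonly published form of the algorithm; the helper `random_task_set`, its dictionary layout and the way periods are sampled are assumptions of this sketch and do not describe the framework's MATLAB submodule.

```python
import random
from typing import Dict, List, Tuple


def uunifast(n: int, total_u: float, rng: random.Random) -> List[float]:
    """Split total_u into n non-negative utilizations that sum to total_u."""
    utilizations = []
    remaining = total_u
    for i in range(1, n):
        next_remaining = remaining * rng.random() ** (1.0 / (n - i))
        utilizations.append(remaining - next_remaining)
        remaining = next_remaining
    utilizations.append(remaining)
    return utilizations


def random_task_set(n: int, total_u: float, period_range: Tuple[int, int],
                    rng: random.Random) -> List[Dict[str, float]]:
    """Hypothetical helper: attach a random integer period to each utilization."""
    tasks = []
    for u in uunifast(n, total_u, rng):
        period = rng.randint(*period_range)
        tasks.append({"period": period, "wcet": u * period})
    return tasks


RNG = random.Random(1)  # fixed seed so the example is repeatable
UTILS = uunifast(5, 0.8, RNG)
TASKS = random_task_set(4, 0.8, (10, 100), RNG)
```

Calling the generator repeatedly with different seeds yields different task sets with the same total utilization, which is the property used to stress a scheduler.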

3.3 Kernel Simulation Module

The Kernel module builds up a global simulation model according to the task set, CPUs, thermal and energy parameters and the selected scheduler, and runs the simulation.

The model represents task, CPU and thermal modules by a set of ordinary differential equations, and generates the signals to/from the scheduler. The scheduler can be represented either as a continuous or a discrete system. Accordingly, the scheduler can be modelled by the paradigm of differential equations or finite automata.

The next subsections describe how the model of each module is built. Later, we explain how these models are merged into a global model.

3.3.1 Task Arrival and CPU's Submodule

The TCPN model representing the Task arrival and CPU's Module (Fig. 4) was first introduced in [17] and evaluated in [16]. Here we briefly explain the TCPN model and the differential equations that represent its behaviour. The task module is composed of places $p_i^{\omega}$, $p_i^{cc}$ and $p_i^{d}$ and transitions $t_i^{\omega}$. Places $p_i^{\omega}$, $p_i^{cc}$ and $p_i^{d}$ represent, respectively, that task $\tau_i$ belongs to the set of tasks, the CPU cycles of task $\tau_i$ that are arriving to the system, and the deadline of $\tau_i$. The rate $\lambda_i^{\omega} = \frac{1}{\omega_i}$ represents the arrival rate of task $\tau_i$.

Fig. 4 Task and CPU TCPN module

The CPU module is composed of places $p_{i,j}^{busy}$ and $p_j^{idle}$, and transitions $t_{i,j}^{alloc}$ and $t_{i,j}^{exec}$. The marking in place $p_{i,j}^{busy}$ represents the amount of task $\tau_i$ that was allocated to $CPU_j$. The marking in place $p_j^{idle}$ represents that $CPU_j$ is idle. The firing of transition $t_{i,j}^{alloc}$ represents that task $\tau_i$ is being allocated to $CPU_j$, and the firing of transition $t_{i,j}^{exec}$ represents that task $\tau_i$ is being executed by $CPU_j$.

The differential Eqs. (4) and (5), which represent the behaviour of the TCPN in Fig. 4, can be derived by considering the following four vectors: the marking and transition vectors of the task module, and the marking and transition vectors of the CPU module, respectively:

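$$m_{\mathcal{T}} = [m_{\mathcal{T}}(p_1^{\omega}), m_{\mathcal{T}}(p_1^{cc}), m_{\mathcal{T}}(p_1^{d}), \ldots, m_{\mathcal{T}}(p_n^{\omega}), m_{\mathcal{T}}(p_n^{cc}), m_{\mathcal{T}}(p_n^{d})]^T,$$

$$T_{\mathcal{T}} = [t_1^{\omega}, \ldots, t_n^{\omega}]^T,$$

and

$$m_P = [m_P(p_{1,1}^{busy}), m_P(p_1^{idle}), \ldots, m_P(p_{n,m}^{busy}), m_P(p_m^{idle})]^T,$$

$$T_P = [t_{1,1}^{alloc}, t_{1,1}^{exec}, \ldots, t_{n,m}^{alloc}, t_{n,m}^{exec}]^T.$$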

Eq. (4) models task arrival: $m_{\mathcal{T}}$ describes how tasks arrive to the system over time, and $w^{alloc} = [w_1^{alloc_1}, \ldots, w_n^{alloc_1}, \ldots, w_n^{alloc_m}]$ represents the allocation of tasks to CPUs. Eq. (5) describes how tasks are allocated to CPUs, where $m_P$ represents the reservation of CPUs for the allocated tasks. It is important to note that the signal $w^{alloc}$ must be computed by the scheduler. This signal is an input to the system and indicates when a task must be allocated to a CPU:

$$\dot{m}_{\mathcal{T}} = C_{\mathcal{T}} \Lambda_{\mathcal{T}} \Pi_{\mathcal{T}}(m)\, m_{\mathcal{T}} - C_{\mathcal{T}}^{alloc} w^{alloc}, \tag{4}$$

$$\dot{m}_P = C_P \Lambda_P \Pi_P(m)\, m_P - C_P^{alloc} w^{alloc}. \tag{5}$$

3.3.2 Task Execution Module

The task execution module is represented by the places $p_{i,j}^{exec}$ depicted in Fig. 4. Considering the vector:

$$m^{exec} = [m^{exec}(p_{1,1}^{exec}), \ldots, m^{exec}(p_{n,m}^{exec})]^T,$$

which represents the marking of the places of the task execution module (for ease of notation, $m_{i,j}^{exec} = m^{exec}(p_{i,j}^{exec})$), the task execution behavior is represented by Eq. (6):

$$\dot{m}^{exec} = A_P\, m_{A_P}, \tag{6}$$

where $A_P$ is built from $C_P \Lambda_P \Pi_P(m)\, m_P$ by keeping only the rows corresponding to places $p_{i,j}^{busy}$ and the columns corresponding to transitions $t_{i,j}^{exec}$; other rows and columns are discarded. The marking $m_{A_P}$ keeps only the marking of places $p_{i,j}^{busy}$; other markings are discarded.

The marking of these places represents the amount of task $\tau_i$ that has been executed by $CPU_j$. This signal is available to any scheduler.

3.3.3 Thermal Module

This work uses the thermal model presented in [17], which was evaluated by comparison with simulations in ANSYS®. The model replaces the thermal partial differential equation with a set of ordinary thermal differential equations. It is as precise as a Finite Element approach and has the advantage that a state model is derived from the analysis. It also avoids the calibration stages of RC thermal approaches, and only requires the isotropic thermal properties of the materials: density, specific heat capacity and thermal conductivity coefficient.

This section presents a brief explanation; for deeper insight please refer to [17] and [16]. The thermal module is composed of several thermal submodules, representing thermal conduction, convection and heat generation. The arc from transition $t_{i,j}^{exec}$ to place $p_i^{com1}$ represents heat generation due to task execution.

Fig. 5 depicts the thermal module. Following the numbering of places and transitions, the temperature behaviour of the system is represented by Eq. (7a):

$$\dot{m}_T = C_T \Lambda_T \Pi_T(m)\, m_T + C_a \Lambda_a \Pi_a(m)\, m_a + C_P^{exec} f^{exec}, \tag{7a}$$

$$\dot{m}_a = 0. \tag{7b}$$

Fig. 5 Thermal module

Here $m_T$ is the distribution of temperature over the system elements, $m_a$ is the ambient temperature, which is considered constant, and $f^{exec}$ is a variable that depends on task allocation and on the frequency at which the CPUs execute tasks. CPU temperature is available for scheduling purposes. Temperature depends on task execution and frequency. In steady state, this is tantamount to saying that temperature depends on the task allocation $w_i^{alloc_j}$ and the frequency. As mentioned above, these parameters are encoded in $f^{exec}$.

At each simulation step, the thermal model reads the new system state and the ambient temperature to compute the new CPU temperature. The main output signals of the Kernel simulator are the task execution vector ($m^{exec}$) and the CPU temperatures ($m_T$) at each simulation step. These output signals are sent to the Scheduler module, which returns a feedback signal for the next step.

3.4 Scheduler Module

The scheduler module allows the user to select, at configuration time, one of the scheduling policies available in the framework or any user-defined scheduler. The signals available to the scheduler from other modules are $m_{i,j}^{exec}$, representing the amount of $\tau_i$ executed by $CPU_j$, and $m_{T_j}$, the temperature of $CPU_j$. The signals that other modules require from the scheduler are the task allocation signals $w_i^{alloc_j}$.

The scheduling policies available in the framework are an RT Global Earliest Deadline First scheduler (G-EDF) [1], an RT fluid scheduler (RT-TCPN) [15], and an RT thermal-aware fluid scheduler (RT-TCPN Thermal aware) for Dynamic Priority Systems (DPS) [16]. The user can define a new scheduler as a continuous or discrete scheduler (Sec. 5).

3.4.1 Custom Scheduler

This section describes how to implement a custom scheduler. Algorithm 1 provides the executive cycle of the simulator. It includes the scheduler (user-defined or pre-programmed) in line 4.

The main input signals from the simulation environment that the user may use are the CPU temperature, the executed tasks, the current CPU frequency and the consumed energy. The secondary signal $m^{busy}$ (the CPU state) and all task and CPU parameters are also available for scheduling purposes. The output signals that the scheduler module must deliver to the simulation engine are $w^{alloc}$ and the frequency. The scheduler does not need to use all the input/output signals; the only mandatory signal that it must deliver to the system is $w^{alloc}$. For presentation purposes, this subsection renames the variables as $x_1 = Y_T$ (CPU temperature), $x_2 = m^{exec}$ (executed tasks), $x_3 = F$ (current CPU frequency) and $x_4 = m^{busy}$ (secondary signals).

If the user describes the scheduler as a continuous one, then the signal $w^{alloc}$ must be written as:

$$\dot{w}^{alloc} = \sum_i A_i x_i.$$

If the user describes the scheduler as a discrete one, then the signal $w^{alloc}$ must be written as:

$$w^{alloc}(\zeta) = \Phi(x_1, \ldots, x_4).$$

In both cases, the matrices $A_i$ or the function $\Phi$ are computed from the scheduler objectives, the task and CPU parameters, and the current state of the system. Designing these functions is the user's responsibility.

Notice that the simulator is time driven. Thus, although in the discrete case the $w^{alloc}$ signal depends on event arrivals, it is represented as a function of time to make it compatible with the whole simulation.

Finally, once the scheduler is defined, it is integrated to the simulation engine and used at each simulation step.
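To make the discrete case more concrete, the sketch below shows one possible shape of a user-defined function $\Phi$ and of a wrapper that holds its last decision between events, so that the result can be read at every step of a time-driven simulation. Everything here is hypothetical: the `coolest_first` policy, the `ZeroOrderHold` class and their signatures are illustrations written in Python, not the framework's MATLAB interface, and the policy gives no real-time guarantee.

```python
from typing import Callable, List, Optional, Sequence

Alloc = List[List[int]]  # alloc[i][j] = 1 if task i is allocated to CPU j


def coolest_first(temperature: Sequence[float], remaining: Sequence[float]) -> Alloc:
    """Toy Phi: give the tasks with the most pending work to the coolest CPUs."""
    n, m = len(remaining), len(temperature)
    alloc = [[0] * m for _ in range(n)]
    cpus = sorted(range(m), key=lambda j: temperature[j])
    tasks = sorted((i for i in range(n) if remaining[i] > 0),
                   key=lambda i: -remaining[i])
    for i, j in zip(tasks, cpus):
        alloc[i][j] = 1
    return alloc


class ZeroOrderHold:
    """Evaluate phi every `steps_per_event` steps and hold the result in between."""

    def __init__(self, phi: Callable[[Sequence[float], Sequence[float]], Alloc],
                 steps_per_event: int) -> None:
        self.phi = phi
        self.steps_per_event = steps_per_event
        self._step = 0
        self._held: Optional[Alloc] = None

    def __call__(self, temperature: Sequence[float],
                 remaining: Sequence[float]) -> Alloc:
        if self._step % self.steps_per_event == 0:
            self._held = self.phi(temperature, remaining)
        self._step += 1
        return self._held


scheduler = ZeroOrderHold(coolest_first, steps_per_event=2)
first = scheduler([50.0, 40.0], [1.0, 0.0, 3.0])   # computed
second = scheduler([10.0, 90.0], [1.0, 0.0, 3.0])  # held from the previous step
third = scheduler([10.0, 90.0], [1.0, 0.0, 3.0])   # recomputed
```

The returned matrix has one row per task and one column per CPU, matching the allocation matrix described in the configuration module.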

3.5 Building the Global Model

A full system in the simulation framework consists of a set of tasks, a set of CPUs on a platform, and a scheduler (as an input), represented by separated models. In order to simulate a full system, these models must be gathered into a global model. In the case of a continuous scheduler, the global model is simulated by solving the system:

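$$\dot{M} = AM + Bw^{alloc} + B'm_a, \qquad Y = SM, \tag{8}$$

where $M = [m_{\mathcal{T}}, m_P, m_T]^T$, and the matrices are:

$$A = \begin{bmatrix} C_{\mathcal{T}} \Lambda_{\mathcal{T}} \Pi_{\mathcal{T}} & 0 & 0 \\ 0 & C_P \Lambda_P \Pi_P & 0 \\ 0 & C_P^{exec} \Lambda^{exec} \Pi^{exec} & C_T \Lambda_T \Pi_T \end{bmatrix}, \tag{9}$$

$$B = \begin{bmatrix} C_{\mathcal{T}}^{alloc} \\ C_P^{alloc} \\ 0 \end{bmatrix}, \qquad B' = \begin{bmatrix} 0 \\ 0 \\ C_a \Lambda_a \Pi_a \end{bmatrix}, \tag{10}$$

$$S = \begin{bmatrix} 0 & 0 & 0 \\ 0 & A_P & 0 \\ 0 & 0 & S_T \end{bmatrix}. \tag{11}$$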

$A_P$ corresponds to the output matrix for task execution and $S_T$ represents the temperature output matrix. Thus the output vector $Y = [m^{exec}, m_T]$ contains the task execution and the temperature of each processor. If the thermal module is not selected for simulation, the global model is slightly different: the vector $M$ only contains $M = [m_{\mathcal{T}}, m_P]$, every matrix in Eqs. (9)-(11) loses its last row, Eq. (9) also loses its last column, and the output vector $Y = m^{exec}$ only contains the task execution.

3.6 Results Module

After a simulation run, the Results module generates plots showing the allocation and execution of jobs on the CPUs and the evolution of the CPU temperatures, as in Fig. 10 and Fig. 11, by using the tools and functions available in MATLAB R2018a (e.g., heat maps).

4 User Interface

Fig. 6 shows the main window of the simulator GUI. There are three areas, which contain information about Tasks parameters, Processors parameters and Thermal parameters.

Fig. 6 Main tab GUI. a) Manual mode, b) UUniFast mode

The user can manually define the parameters of the task set in the area entitled Tasks parameters. The parameters to be defined are the number of tasks and, for each task, the WCET, the period and the energy consumption (see Fig. 6a). Alternatively, a random set of tasks can be configured by checking the UUniFast check box (Fig. 6b), setting the number of tasks, the utilization of the task set and the interval of periods, and then clicking the button Generate task set.

The area entitled Processors parameters is used to set the number of CPUs and the homogeneous clock frequency. Last, the area Thermal parameters consists of three subareas. The first two correspond to the geometry (length X, height Y, and width Z) and the material properties (thermal conductivity coefficient k, density ρ, specific heat capacity C_p) of the board and the CPUs.

The third subarea is for entering the mesh geometry (Mesh step) and the accuracy (dt) of the solutions obtained by the TCPN thermal model [17]. Once these parameters have been set, the next step is to select a scheduling policy by clicking the button Select Scheduler in the Scheduler selection tab (Fig. 7). The available scheduling policies appear in a selection list. User-defined scheduling policies also show up in this list (Fig. 7b) once the installation guidelines have been followed. A scheduler frame shows the structure of the scheduler for the selected scheduling policy.

Fig. 7 Scheduler selection tab GUI

The last step consists of clicking the button Compute Schedule to run the simulation. After a simulation run, the GUI can generate plots of the allocation and execution of the tasks' jobs on the CPUs and of the evolution of the CPU temperatures. Fig. 7b shows the available buttons: Save data, which saves all the simulation data into a file; Plot, which plots task execution and CPU temperature evolution; and Heat map, which is only functional if the user configured the thermal parameters before the simulation.

5 Available Schedulers

This section presents the available schedulers and their implementation in the framework. We only provide global multiprocessor schedulers, for the reasons explained in Section 2. The scheduling policies available in the framework are G-EDF [1], the RT fluid scheduler RT-TCPN [15], and the RT-TCPN Thermal aware scheduler, which are described below.

5.1 G-EDF

This scheduler implements a global EDF (G-EDF) algorithm. It is a global job-level fixed-priority scheduling algorithm for sporadic task systems, which is optimal for implicit-deadline tasks with regard to soft RT constraints [2]. Jobs are allocated to CPUs from a single queue. The highest priority is assigned to the job with the earliest absolute deadline. The signal $w_i^{alloc_j}$ must be discrete. At each simulation step the scheduler can be written as $w^{alloc} = \Phi(m^{exec})$. According to the EDF algorithm, the scheduling events are task activations and task completions, which are the only points at which task preemption can occur. The discrete function $\Phi(m^{exec})$ is implemented by Algorithm 2.
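The selection rule at the core of G-EDF can be sketched in a few lines of Python. This is not Algorithm 2 from the paper; it is a simplified illustration that assumes the absolute deadline and the remaining work of the current job of each task are already known, and the function name and arguments are hypothetical.

```python
from typing import List, Sequence


def gedf_allocate(deadlines: Sequence[float], remaining: Sequence[float],
                  n_cpus: int) -> List[List[int]]:
    """Allocate the ready jobs with the earliest absolute deadlines to the CPUs.

    Returns alloc with alloc[i][j] = 1 if the job of task i runs on CPU j.
    Ties are broken by task index.
    """
    n = len(deadlines)
    ready = sorted((i for i in range(n) if remaining[i] > 0),
                   key=lambda i: (deadlines[i], i))
    alloc = [[0] * n_cpus for _ in range(n)]
    for cpu, task in enumerate(ready[:n_cpus]):
        alloc[task][cpu] = 1
    return alloc


# Three tasks, two CPUs: tasks 1 and 2 have the earliest deadlines.
ALLOC = gedf_allocate(deadlines=[12, 4, 8], remaining=[1.0, 2.0, 1.0], n_cpus=2)
```

Re-evaluating this function at each task activation or completion reproduces the event-driven behaviour described above; a real implementation would also try to keep a running job on its current CPU to limit migrations.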

5.2 RT-TCPN

RT-TCPN is a scheduler based on the TCPN model presented in [15]. It is composed of a global fluid scheduler and a discretization of the fluid scheduler. The fluid schedule is computed only at deadlines, whereas preemptions and context switches occur at every quantum, at which we check the difference between the actual and the expected fluid execution. RT-TCPN obtains a discrete schedule that closely tracks the fluid one ($m^{exec}$) by computing a schedule up to the hyperperiod [3].

The algorithm considers the ordered set of the deadlines of all the tasks' jobs to define scheduling intervals, as in deadline partitioning [18]. Each task $\tau_i$ must be executed $n_i = \frac{H}{\omega_i}$ times within the hyperperiod $H$. Thus every $q \cdot \omega_i$, where $q = 1, \ldots, n_i$, is a deadline that must be considered in the analysis. These deadlines can be gathered and ordered in the set $SD_i = \{sd_i^1, \ldots, sd_i^{n_i}\}$. A general set of deadlines is defined as $SD = SD_0 \cup \ldots \cup SD_{|\mathcal{T}|}$ where $SD_0 = \{0\}$. The elements of $SD$ can be arranged in ascending order and renamed as $SD = \{sd_0, \ldots, sd_{\alpha}\}$, where $sd_{\alpha}$ is the last deadline. We define the quantum $Q$ as in [15]. The fluid execution $FSC = [FSC_{\tau_1}^{1}, \ldots, FSC_{\tau_i}^{j}, \ldots, FSC_{\tau_n}^{m}]$ is a vector with (number of tasks) × (number of CPUs) entries. The signal $FSC_{\tau_i}^{j}(\zeta)$ stands for the desired allocation of the $i$-th task to the $j$-th CPU at time $\zeta$. This function can be computed as:

$$FSC_{\tau_i}^{j}(\zeta) = \frac{\beta_i^j \times cc_i}{H}\, \zeta, \tag{12}$$

where $\beta_i^j$ is an unknown parameter that represents the number of jobs of task $\tau_i$ which are assigned to $CPU_j$ per time unit. This value is used to compute a distributed fluid schedule function that considers temporal constraints, according to an offline stage that solves the following linear programming problem:

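$$\begin{aligned} \min \quad & \sum_{j=1}^{m} \sum_{\tau_a \in \mathcal{T}} \beta_a^j \\ \text{s.t.} \quad & \sum_{j=1}^{m} \beta_i^j = \frac{H}{\omega_i} \quad \forall i = 1, \ldots, n, \\ & \sum_{\tau_a \in \mathcal{T}} \frac{cc_a \times \beta_a^j}{H} \le 1 \quad \forall j = 1, \ldots, m. \end{aligned} \tag{13}$$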
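As a hedged illustration of the two constraint families in the linear program, the Python sketch below checks whether a candidate assignment of jobs to CPUs satisfies them. It does not solve the optimization problem. The constraint reading (every task places $H/\omega_i$ jobs in total, and no CPU is loaded beyond the hyperperiod) is our interpretation of Eq. (13) at normalized frequency 1, and the task values and function name are invented.

```python
from fractions import Fraction
from typing import Sequence


def is_feasible(beta: Sequence[Sequence[int]], cc: Sequence[int],
                periods: Sequence[int], h: int) -> bool:
    """Check a candidate beta, where beta[i][j] is the number of jobs of task i on CPU j.

    Job constraint:  sum_j beta[i][j] == H / w_i          for every task i.
    Load constraint: sum_i cc[i] * beta[i][j] / H <= 1    for every CPU j.
    """
    n, m = len(beta), len(beta[0])
    jobs_ok = all(sum(beta[i]) == Fraction(h, periods[i]) for i in range(n))
    load_ok = all(sum(cc[i] * beta[i][j] for i in range(n)) <= h for j in range(m))
    return jobs_ok and load_ok


# Hypothetical example: two tasks, two CPUs, H = lcm(4, 6) = 12.
CC = [2, 3]
PERIODS = [4, 6]
H = 12

SPLIT = is_feasible([[3, 0], [0, 2]], CC, PERIODS, H)    # one task per CPU
PACKED = is_feasible([[3, 0], [2, 0]], CC, PERIODS, H)   # both on CPU 0, load exactly 1
MISSING = is_feasible([[2, 0], [0, 2]], CC, PERIODS, H)  # task 0 lacks one job
```

A check of this kind is useful for validating the output of whatever LP solver is used in the offline stage.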

The actual execution mi,jexec of τ_i in CPU_j must be equal to FSCTij. In the on-line stage a sliding mode controller yields an output signal jωialloc that is proportional to the error (difference between FSCTiζj and mi,jexec). The formal proof of this controller is given in ^[¹⁵^]. The scheduler ( ωiallocj) meets all deadlines assuming an infinitesimal share of the CPUs because its fluid nature, ans therefore must be discretized. The discrete scheduler must be written as: Walloc=Φmexec,FSC, where Φmexec,FSC is described by Algorithm 3. The computed discrete schedule Walloc matches the fluid schedule at every deadline sdk∈SD. Since these time points are the deadlines of jobs, then the algorithm ensures that the discrete schedule meets all deadlines of all tasks.
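The deadline set $SD$ on which the discrete and fluid schedules are required to coincide can be computed directly from the task periods. The Python sketch below assumes integer periods and implicit deadlines (each job's deadline is the end of its period); the function names and the example periods are illustrative only.

```python
from functools import reduce
from math import gcd
from typing import List, Sequence, Tuple


def deadline_set(periods: Sequence[int]) -> List[int]:
    """Return SD: 0 plus every q * w_i with q = 1..H/w_i, in ascending order."""
    h = reduce(lambda a, b: a * b // gcd(a, b), periods)  # hyperperiod H
    deadlines = {0}                                       # SD_0 = {0}
    for w in periods:
        deadlines.update(q * w for q in range(1, h // w + 1))
    return sorted(deadlines)


def scheduling_intervals(sd: Sequence[int]) -> List[Tuple[int, int]]:
    """Consecutive deadlines delimit the scheduling intervals."""
    return list(zip(sd, sd[1:]))


SD = deadline_set([4, 6])            # [0, 4, 6, 8, 12]
INTERVALS = scheduling_intervals(SD)  # [(0, 4), (4, 6), (6, 8), (8, 12)]
```

Scheduling decisions are then confined to these intervals, which is what keeps the number of preemptions below that of quantum-based approaches.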

Fig. 8 depicts the scheme of the algorithm implemented in this framework. The dotted box A contains a set of signals that represent the normal behavior of the system. Signal A.1 describes the function $FSC_{\tau_i}^{j}(\zeta)$ obtained in the off-line stage, and is used as the reference for the on-line stage. In the on-line stage the controller yields an output signal ($\omega_i^{alloc,j}$, named A.2 in Fig. 8). The controller output is integrated by the TCPN model. Signal A.3 describes the output of this model. Finally, the algorithm RT-TCPN uses the difference between $m_{i,j}^{exec}$ (the expected executed amount of τ_i in CPU_j) and $M_{i,j}^{exec}$ (the actual executed amount of τ_i in CPU_j) to compute $W_i^{alloc,j}$, i.e. the on-line allocation of τ_i to CPU_j at every quantum $Q$.

Fig. 8 RT-TCPN Scheme

5.3 RT-TCPN Thermal Aware

The schedule computation in this algorithm encompasses temporal and thermal constraints. The temperature of the chip must be kept under a temperature threshold T_max that depends on system design requirements. The fluid schedule function introduced in ^[³^] is extended to include thermal restrictions. This is achieved by considering the steady state of Eq. 7a, i.e. if the schedule is periodic, fluid and evenly distributed over the hyperperiod, then the temperature must reach a steady state. Thermal Eqs. (7a and 7b) can be rewritten as:

$$\dot{M}_T = A_T M_T + B_T\, w^{alloc} + B'_T\, m_a, \qquad Y_T = S' M_T, \tag{14}$$

where $A_T$ is the system matrix, $B_T$ is the input matrix, and $B'_T$ is the matrix associated with the ambient temperature $m_a$, which is considered constant. These matrices are:

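$$A_T = \begin{pmatrix} C_T \Lambda_T \Pi_T & C_p^{exec} \Lambda^{exec} \Pi^{exec} \\ 0 & C_p \Lambda_p \Pi_p \end{pmatrix}, \tag{15}$$

$$B'_T = \begin{pmatrix} C_a \Lambda_a \Pi_a \\ 0 \end{pmatrix}, \qquad B_T = \begin{pmatrix} 0 \\ C_p^{alloc} \end{pmatrix}. \tag{16}$$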

Thus, in a thermal steady state $\dot{M}_T = 0$, and $M_T$ and $Y_T$ are respectively renamed as $M_{T_{ss}}$ and $Y_{T_{ss}}$, indicating a system steady state temperature, which can be computed as follows:

$$M_{T_{ss}} = -A_T^{-1}\left(B_T\, w^{alloc} + B'_T\, m_a\right), \qquad Y_{T_{ss}} = S' M_{T_{ss}}. \tag{17}$$

The steady state temperature $Y_{T_{ss}}^{k}$ of CPU_k must be less than or equal to its maximum temperature level, i.e., $Y_{T_{ss}}^{k} \le T_{max}^{k}$, so as not to violate the thermal constraint. In vector form: $S' M_{T_{ss}} \le T_{max}$.

The thermal constraint is derived by combining the last expression and Eq. (17).

$$-S' A_T^{-1} B_T\, w^{alloc} \le T_{max} + S' A_T^{-1} B'_T\, m_a. \tag{18}$$
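To make the step from Eq. (17) to Eq. (18) concrete, the sketch below evaluates a scalar, single-node instance of the thermal model, with $S' = 1$. All coefficient values are hypothetical and chosen only so that the arithmetic is easy to follow; they are not parameters of the framework or of the platform in Section 6. The sketch assumes a stable model (`a < 0`) and a positive input gain (`b > 0`).

```python
def steady_state_temp(a, b, b_amb, w, t_amb):
    """Scalar form of Eq. (17): 0 = a*T + b*w + b_amb*t_amb,
    hence T_ss = -(b*w + b_amb*t_amb) / a."""
    return -(b * w + b_amb * t_amb) / a


def max_allocation(a, b, b_amb, t_amb, t_max):
    """Scalar form of Eq. (18) solved for w:
    -(b/a) * w <= t_max + (b_amb/a) * t_amb.
    Valid for a < 0 and b > 0, so that -(b/a) is positive."""
    return (t_max + (b_amb / a) * t_amb) / (-(b / a))


# Hypothetical coefficients (assumptions for illustration only).
a, b, b_amb = -0.5, 30.0, 0.5
t_amb, t_max = 25.0, 80.0

print(steady_state_temp(a, b, b_amb, 1.0, t_amb))  # full allocation: 85.0
w_max = max_allocation(a, b, b_amb, t_amb, t_max)
print(w_max)                                       # largest admissible w
print(steady_state_temp(a, b, b_amb, w_max, t_amb))
```

With these numbers a fully loaded CPU settles above the threshold, while limiting the allocation to `w_max` makes the steady state temperature sit at the threshold. This is the role the thermal constraint plays when it bounds the allocation.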

The task allocation vector $\omega^{alloc}$ depends on the unknown parameters $\beta_i^{j}$. These parameters are used to compute the distributed fluid schedule function that considers thermal and temporal constraints, according to the following linear programming problem:

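$$\begin{aligned}
\min\ & \sum_{j=1}^{m} \sum_{\tau_a \in T} \beta_a^{j} \\
\text{s.t.}\ & -S' A_T^{-1} B_T \left[ \sum_{\tau_a} \frac{\beta_a^{1}\, cc_a}{H}, \dots, \sum_{\tau_a} \frac{\beta_a^{m}\, cc_a}{H} \right]^{\top} \le T_{max} + S' A_T^{-1} B'_T\, m_a, \\
& \sum_{j=1}^{m} \beta_i^{j} = \frac{H}{\omega_i} \quad \forall i = 1, \dots, n, \\
& \sum_{\tau_a \in T} \frac{cc_a \times \beta_a^{j}}{H} \le 1 \quad \forall j = 1, \dots, m.
\end{aligned} \tag{19}$$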

The first constraint is the thermal constraint. The other two constraints are a straightforward extension of the time and CPU utilization constraints used in the RT-TCPN algorithm for the multiprocessor case. The required fluid schedule function $FSC_{\tau_i}^{j}(\zeta)$ is defined as in Eq. (12). The temporal and thermal requirements are met as long as the tasks are executed according to this function. Task execution is represented by the variable $m_{i,j}^{exec}(\zeta)$, hence the task execution error is defined as $e_{i,j}(\zeta) = FSC_{\tau_i}^{j}(\zeta) - m_{i,j}^{exec}(\zeta)$. If $e_{i,j}(\zeta) = 0$, then tasks are executed at the adequate rate, and the time and thermal constraints are met. Therefore, this error can be kept equal to zero by appropriately selecting $\omega_i^{alloc,j}(\zeta)$. We propose a sliding mode controller for this purpose. The formal proof for this controller is beyond the scope of this paper. We provide the following key hints, nonetheless.

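Consider the sliding surface:

$$S_{i,j}(\zeta) = \frac{K_1}{\lambda_{i,j}^{exec}}\, e_{i,j}(\zeta) + \frac{\beta_i^{j} \times cc_i}{H\, \lambda_{i,j}^{exec}} - m_{i,j}^{busy}, \tag{20}$$

$$\omega_i^{alloc,j}(\zeta) = \hat{\omega}_i^{alloc,j}(\zeta) + \frac{K_1}{\lambda_{i,j}^{exec}}\, \frac{\beta_i^{j} \times cc_i}{H}, \tag{21}$$

where $\hat{\omega}_i^{alloc,j}(\zeta) = K_2\, \mathrm{sign}(S_{i,j}(\zeta))$ and $\mathrm{sign}(x) = 1$ if $x \ge 0$; 0 otherwise.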

Figure 9 shows how the implemented algorithm works. It is composed of the off-line and on-line stages. Based on an LPP, the off-line stage computes the functions $FSC_{\tau_i}^{j}(\zeta)$ that meet the temporal and thermal constraints. The on-line stage uses these functions as the target for task execution. The sliding-mode controller continuously allocates tasks to CPUs to ensure that the task execution tracks the functions $FSC_{\tau_i}^{j}(\zeta)$. The dotted box A in Fig. 9 shows the set of signals that represent the normal behavior of the system. Box B depicts a disturbance, i.e. an unexpected behavior of the system such as a CPU detention. Signal B.1 describes $FSC_{\tau_i}^{j}(\zeta)$, as in the normal case. If a CPU detention occurs at time ζ₁, then the difference between $FSC_{\tau_i}^{j}(\zeta)$ and the actual execution of τ_i in CPU_j starts to increase at this time. Thus the controller output signal ($\omega_i^{alloc,j}$, named B.2) also starts to increase at time ζ₁. The controller output is integrated by the TCPN model (Eq. 8), producing an output $m_{i,j}^{exec}$ (signal B.3) that is greater than $FSC_{\tau_i}^{j}(\zeta)$. Finally, the RT-TCPN Thermal Aware algorithm computes $W_i^{alloc,j}$, which also increases at time ζ₂. When the CPU resumes at time ζ₂, the controller allocates tasks to CPUs more often than in the normal case, using the CPU idle periods until normal operation is reached at time ζ₃. Approaches that do not include a continuous controller are not able to recover from CPU detentions or other disturbances that can temporarily stop task execution.
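The recovery behavior described above can be reproduced qualitatively with a toy discrete-time simulation. This sketch is deliberately much simpler than the algorithm of this section and should not be read as its implementation: it tracks a single task on a single CPU, replaces the TCPN model of Eq. 8 with a plain accumulator, and reduces the sliding surface to the tracking error itself. The function `simulate` and all numeric values are hypothetical; only the definition of `sign` follows the text.

```python
def sign(x):
    # As defined in the text: 1 if x >= 0, otherwise 0.
    return 1 if x >= 0 else 0


def simulate(ticks, stall_start, stall_end, k2=2):
    """Relay controller tracking a fluid reference of one work unit per tick.

    The CPU is halted for stall_start <= t < stall_end (a 'CPU detention').
    Returns the tracking error sampled at every tick.
    """
    executed = 0
    errors = []
    for t in range(ticks):
        target = t                 # fluid reference (plays the role of FSC)
        e = target - executed      # tracking error e = FSC - m_exec
        errors.append(e)
        w = k2 * sign(e)           # switching control action, gain K2
        cpu_running = not (stall_start <= t < stall_end)
        if cpu_running:
            executed += w          # execution only advances on a running CPU
    return errors


errors = simulate(ticks=40, stall_start=10, stall_end=20)
print(errors[:10])   # normal operation: the error chatters between 0 and -1
print(max(errors))   # 10, reached when the CPU resumes
print(errors[30])    # 0, the backlog has been recovered
```

In this run the detention starts at tick 10 (ζ₁), the CPU resumes at tick 20 (ζ₂), and the error returns to its normal band at tick 30 (ζ₃), because the gain `k2` is larger than the slope of the reference and the controller can therefore use the spare CPU capacity to catch up.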

Fig. 9 RT-TCPN Thermal-Aware Scheme

6 Example

We illustrate the usage of the simulation framework with the following experiments. First, we compare the temperature variations of the simulated system as generated by the available schedulers.

In the experiments we assume a platform composed of two homogeneous 1 cm × 1 cm silicon microprocessors mounted over a 5 cm × 5 cm copper heat spreader, as in ^[¹⁷^]. The thicknesses of the silicon microprocessors and the copper heat spreader are 0.5 mm and 1 mm, respectively.

The experiment considers the task set $T = \{\tau_1, \tau_2, \tau_3\}$. The maximum operating temperature of the cores is set at $T_{max}^{1,2} = 80\,^{\circ}\mathrm{C}$. The parameters of each task are its WCET (in CPU cycles), period ω, deadline d (with ω = d in this case) and consumed energy e, resulting in $\tau_1 = (2 \times 10^{9}, 4, 4, 35)$, $\tau_2 = (3 \times 10^{9}, 8, 8, 40)$ and $\tau_3 = (3 \times 10^{9}, 12, 12, 45)$.

Fig. 10 depicts the temperatures obtained with each scheduler for both CPUs. G-EDF and RT-TCPN obtain a feasible schedule; however, the resulting temperatures violate the thermal constraint. In contrast, the RT-TCPN Thermal scheduler meets both thermal and temporal requirements. Fig. 11 presents the heat maps obtained by the simulator.

Fig. 10 Temperature comparison obtained by the implemented schedulers

Fig. 11 Heat map obtained by a) G-EDF, b) RT-TCPN, c) RT-TCPN Thermal Aware

7 Conclusion and Future Work

Designing, testing and comparing RT scheduling methods on multiprocessors is a tedious and time-consuming chore, all the more so when thermal restrictions are considered. Resorting to a real system implementation is overkill during the early design stages. We have developed a simulation framework which encompasses modules for defining and modelling tasks, CPUs, thermal properties and three global RT schedulers out of the box. It is available at https://www.gdl.cinvestav.mx/art/uploads/SchedulerFrameworkTCPN.zip and is distributed as open source software, as is.

The main contribution compared with other real-time simulation tools lies in its capability to analyze temperature under different scheduling policies, which is useful for detecting thermal management problems that can be solved through a correction in the scheduling algorithms.

We are working to include a number of improvements such as simpler procedures to replace or customize the thermal, task and CPU models, additional algorithms to generate task sets, and thermal-energy aware schedulers. Following the trend of some current multiprocessors, the framework will also include models for heterogeneous CPUs with per-CPU frequency adjustments.

Acknowledgements

This work was partially supported by grants TIN2016-76635-C2-1-R (AEI/FEDER, UE), gaZ: T48 research group (Aragón Gov. and European ESF), and HiPEAC4 (European H2020/687698).

References

1. Baker, T. P. (2005). A comparison of global and partitioned EDF schedulability tests for multiprocessors. International Conf. on Real-Time and Network Systems.

2. Baruah, S., Bertogna, M., & Buttazzo, G. (2015). Multiprocessor Scheduling for Real-Time Systems. Springer-Verlag New York, Inc., Secaucus, NJ, USA.

3. Baruah, S. K., Cohen, N. K., Plaxton, C. G., & Varvel, D. A. (1996). Proportionate progress: A notion of fairness in resource allocation. Algorithmica, Vol. 15, No. 6, pp. 600-625.

4. Bini, E. & Buttazzo, G. C. (2005). Measuring the performance of schedulability tests. Real-Time Systems, Vol. 30, No. 1-2, pp. 129-154.

5. Brandenburg, B., Block, A., Calandrino, J., Devi, U., Leontyev, H., & Anderson, J. (2007). LITMUS^RT: A status report. Proceedings of the 9th Real-Time Linux Workshop, pp. 107-123.

6. Brandenburg, B. B. & Gül, M. (2016). Global scheduling not required: Simple, near-optimal multiprocessor real-time scheduling with semi-partitioned reservation. IEEE Real-Time Systems Symposium (RTSS 2016), pp. 99-110.

7. Buttazzo, G. (2011). Hard real-time computing systems: predictable scheduling algorithms and applications, volume 24. Springer Science & Business Media.

8. Calandrino, J. M., Anderson, J. H., & Baumberger, D. P. (2007). A hybrid real-time scheduling approach for large-scale multicore platforms. Proceedings of the 19th Euromicro Conference on Real-Time Systems, ECRTS '07, IEEE Computer Society, Washington, DC, USA, pp. 247-258.

9. Casini, D., Biondi, A., & Buttazzo, G. (2017). Semi-Partitioned Scheduling of Dynamic Real-Time Workload: A Practical Approach Based on Analysis-Driven Load Balancing. Bertogna, M., editor, 29th Euromicro Conference on Real-Time Systems (ECRTS 2017), volume 76 of Leibniz International Proceedings in Informatics (LIPIcs), Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, pp. 13:1-13:23.

10. Chandarli, Y., Fauberteau, F., Masson, D., Midonnet, S., & Qamhieh, M. (2012). Yartiss: A tool to visualize, test, compare and evaluate real-time scheduling algorithms. WATERS 2012, UPE LIGM ESIEE, pp. 21-26.

11. Chéramy, M., Hladik, P.-E., & Déplanche, A.-M. (2014). Simso: A simulation tool to evaluate real-time multiprocessor scheduling algorithms. 5th International Workshop on Analysis Tools and Methodologies for Embedded and Real-time Systems (WATERS), pp. 6-p.

12. David, R. & Alla, H. (2008). Discrete, continuous and hybrid Petri nets. Control Systems, IEEE, Vol. 28, No. 3, pp. 81-84.

13. Desel, J. & Esparza, J. (1995). Free Choice Petri Nets. Cambridge Tracts in Theoretical Computer Science 40.

14. Desirena, G., Rubio, L., Ramirez, A., & Briz, J. (2019). Thermal-aware HRT scheduling simulation framework.

15. Desirena-Lopez, G., Briz, J. L., Vázquez, C. R., Ramírez-Treviño, A., & Gómez-Gutiérrez, D. (2016). On-line scheduling in multiprocessor systems based on continuous control using timed continuous Petri nets. 13th International Workshop on Discrete Event Systems, pp. 278-283.

16. Desirena-Lopez, G., Ramírez-Treviño, A., Briz, J. L., Vázquez, C. R., & Gómez-Gutiérrez, D. (2019). Thermal-aware real-time scheduling using timed continuous Petri nets. ACM Transactions on Embedded Computing Systems. To appear, accepted Apr. 2019.

17. Desirena-Lopez, G., Vázquez, C. R., Ramírez-Treviño, A., & Gómez-Gutiérrez, D. (2014). Thermal modelling for temperature control in MPSoC's using fluid Petri nets. IEEE Conference on Control Applications, part of Multi-conference on Systems and Control.

18. Funk, S., Levin, G., Sadowski, C., Pye, I., & Brandt, S. (2011). DP-Fair: a unifying theory for optimal hard real-time multiprocessor scheduling. Real-Time Systems, Vol. 47, No. 5, pp. 389-429.

19. Kato, S. & Yamasaki, N. (2007). Real-time scheduling with task splitting on multiprocessors. Proceedings of the 13th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, RTCSA '07, IEEE Computer Society, Washington, DC, USA, pp. 441-450.

20. MATLAB (2018). version 9.4 (R2018a). The MathWorks Inc., Natick, Massachusetts.

21. Oh, D.-I. & Baker, T. P. (1998). Utilization bounds for n-processor rate monotone scheduling with static processor assignment. Real-Time Systems, Vol. 15, No. 2, pp. 183-192.

22. Rubio-Anguiano, L., Desirena-López, G., Ramírez-Treviño, A., & Briz, J. (2018). Energy-efficient thermal-aware scheduling for RT tasks using TCPN. IFAC-PapersOnLine, Vol. 51, No. 7, pp. 236-242.

23. Silva, M., Júlvez, J., Mahulea, C., & Vázquez, C. R. (2011). On fluidization of discrete event models: observation and control of continuous Petri nets. Discrete Event Dynamic Systems, Vol. 21, No. 4, pp. 427-497.

24. Silva, M. & Recalde, L. (2007). Redes de Petri continuas: Expresividad, análisis y control de una clase de sistemas lineales conmutados. Revista Iberoamericana de Automática e Informática Industrial, Vol. 4, No. 3, pp. 5-33.

25. Singhoff, F., Legrand, J., Nana, L., & Marcé, L. (2004). Cheddar: a flexible real time scheduling framework. ACM SIGAda Ada Letters, volume 24(4), ACM, pp. 1-8.

26. Srinivasan, J., Adve, S. V., Bose, P., & Rivers, J. A. (2005). Exploiting structural duplication for lifetime reliability enhancement. Computer Architecture, 2005. ISCA'05. Proceedings. 32nd International Symposium on, IEEE, pp. 520-531.

¹ Available at: https://www.gdl.cinvestav.mx/art/uploads/SchedulerFrameworkTCPN.zip

Received: October 24, 2018; Accepted: February 16, 2019

^* Corresponding author is Gaddiel Desirena Lopez. gdesirena@gdl.cinvestav.mx

This is an open-access article distributed under the terms of the Creative Commons Attribution License