Edited by: Gerrit C. Van Der Veer, University of Twente, Netherlands

Reviewed by: Andrej Košir, University of Ljubljana, Slovenia; Gualtiero Volpe, Università di Genova, Italy

This article was submitted to Human-Media Interaction, a section of the journal Frontiers in ICT

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Interactive optimization methods are particularly suited for letting human decision makers learn about a problem, while a computer learns about their preferences to generate relevant solutions. For interactive optimization methods to be adopted in practice, computational frameworks are required, which can handle and visualize many objectives simultaneously, provide optimal solutions quickly and representatively, all while remaining simple and intuitive to use and understand by practitioners. Addressing these issues, this work introduces SAGESSE (Systematic Analysis, Generation, Exploration, Steering and Synthesis Experience), a decision support methodology, which relies on interactive multiobjective optimization. Its innovative aspects reside in the combination of (i) parallel coordinates as a means to simultaneously explore and steer the underlying alternative generation process, (ii) a Sobol sequence to efficiently sample the points to explore in the objective space, and (iii) on-the-fly application of multiattribute decision analysis, cluster analysis and other data visualization techniques linked to the parallel coordinates. An illustrative example demonstrates the applicability of the methodology to a large, complex urban planning problem.

Making a decision involves balancing multiple competing criteria in order to identify a most-preferred alternative. For simple, day-to-day decisions, this can usually be done by relying on intuition and common sense alone.

For larger, more complex decisions, common sense may not suffice, and multicriteria decision analysis (MCDA) can be used to formalize the problem, both improving the decision and making it more transparent (Keeney,

To make better decisions requires a clear knowledge of the available alternatives. However, research has shown that without adequate support, the identification of alternatives is difficult and often incomplete, even for experts in a field (León,

However, solving multiobjective problems implies that, contrary to single-objective problems, not one well-defined solution is found, but a set of equally interesting

When considered collectively as a

A posteriori, once all the Pareto optimal solutions have been identified. The advantage is that the decision maker has a complete overview of the available options. On the other hand, the calculation can be extremely long if the solution space is vast, the decision maker may not have the time to wait, and the process will certainly compute many solutions which are of little interest. Even if time were not an issue, the difficulty of visualizing, interpreting and understanding the Pareto optimal results can compromise the trust of the decision maker (DM), especially when more than three objectives are considered.

A priori, before starting any calculations. This is the most efficient approach, as theoretically only one solution is calculated. However, it is also probably the most difficult from the decision maker's point of view, as it assumes that they are perfectly aware of their preferences and acceptable tradeoffs, and are able to formulate them precisely. In practice, when dealing with complex and interdisciplinary problems, this knowledge is generally unavailable until the solutions are calculated, and therefore the risk of reaching an infeasible or unsatisfactory solution is high (Meignan et al.,

Interactively, as the optimization progresses. This is a common response to the limitations of a priori and a posteriori approaches. By involving the human decision maker directly in the search process (Kok,

The goal of this paper is to highlight the current gaps in the literature which limit the application of IO to large problems, and to propose a novel methodology addressing these gaps.

Interactive optimization consists of four main components which are combined to form a human-computer interaction system: a user, a graphical user interface (GUI), a solution generator and an analyst (Figure

Main components and flows of information in interactive optimization. GUI: graphical user interface. Adapted from Spronk (

During a preparatory

The main premise for human-computer interaction is that complex problems can be better solved by harnessing the respective strengths of each party (Fisher,

The relative strengths of the computer are in counting or combining physical quantities, storing and displaying detailed information and performing repetitive tasks rapidly and simultaneously over long periods of time (Shneiderman,

There are several compelling benefits of involving a human user in the interactive optimization process. First, the incorporation of expert knowledge, intuition and experience can compensate for the unavoidable simplifications induced by the model (Meignan et al.,

On the other hand, the main drawbacks are that IO methods rely on the assumptions that a human DM is available, that they are willing to devote time to the solution process, and that they are able to understand the process, inputs asked of them, and resulting outcomes (Hwang and Masud,

Over the past decades, a variety of interactive optimization methods have been developed, with efforts both in improving the underlying search procedures, and interaction mechanisms. (Kok,

Many efforts have been made to review and synthesize the technical developments in the field of interactive optimization. The earlier developments of search-based methods are described by Hwang and Masud (

Beyond the underlying technical aspects of interactive optimization, growing interest has been devoted to the learning opportunities which it provides. In this vein, Klau et al. (

Allmendinger et al. (

The need for intuitive visualization of multiobjective optimization results and interaction with the optimizer has been recognized as a central issue (Xiao et al.,

Much attention has been given to advanced visualization methods for results of

Several studies investigated the practical applicability of parallel coordinates in the context of multiobjective optimization. Akle et al. (

Given the widespread attention received by parallel coordinates, their adoption in the context of multiobjective optimization is not surprising. However, their use remains predominantly confined to

A selected number of methods are described hereafter, outlining their responses to the issues above as well as the remaining gaps. For a more extensive overview of existing approaches, we refer to Branke et al. (

The Pareto Race tool developed by Korhonen and Wallenius (

Another navigation method is Pareto Navigation (Monz et al.,

The approach developed by Miettinen and Mäkelä (

In a separate work, Miettinen et al. (

Babbar-Sebens et al. (

Stump et al. (

A wide variety of preference types, procedures and interfaces for interactive methods emerge from the existing literature. However, in spite of the early efforts in developing effective search-based procedures, and the more recent efforts in making tools which enable user learning, there remains a slow progression of “application-oriented” methods, which succeed in being adopted outside of academia and for addressing real, large-scale problems (Gardiner and Vanderpooten,

First and foremost, methods must have the ability to handle many objectives, and produce many efficient alternatives reflecting the complexity of real-world problems. Xiao et al. (

The previous requirement leads to the need for methods which are capable of overcoming the associated computational burden. It is crucial that results are delivered promptly to reduce latency time for users, whose willingness to participate might otherwise be compromised (Collette and Siarry,

Visualization approaches for multiobjective optimization results are equally important, and have been extensively reviewed by Packham et al. (

Finally, a simple and intuitive interface is necessary to complement the aforementioned requirements. The user must be able not only to easily understand the results, but also to steer the process with minimal effort. However, complex jargon and difficult inputs are still considered barriers to a wider adoption of interactive methods in practice (Cohon,

Summary of requirements for “application-oriented” interactive optimization methodologies. The key features from the methodology proposed in this paper (SAGESSE) and their relationship to the requirements are indicated on the right.

While the methods reviewed above address one or several of these requirements, none addresses them all simultaneously. The objectives of this paper are thus (i) to introduce a new interactive optimization methodology addressing the requirements in Figure

SAGESSE – for

Novel paradigm for interactive multiobjective optimization, where generation, exploration and steering are performed continuously instead of sequentially.

Figure

Overview of components, workflow and main software involved in the interactive optimization methodology and case-study. Gray text indicates optional tasks.

When accessing the interface, the user can either start a new project, or reload an existing one. For a new project, by default an empty parallel coordinates chart with preselected criteria is displayed. An advantage of starting from an empty chart is that it attenuates the risk of anchoring bias, which may cause the user to fixate too soon on possibly irrelevant starting solutions, at the expense of exploring a wider variety of solutions (Miettinen et al.,

Snapshot of the graphical user interface demonstrating several features of the SAGESSE methodology, including axis and polyline styling, multiattribute and cluster analysis results, and the axis selection menu. The line color indicates which of the three clusters a line belongs to (bold axis label), while the line thickness is proportional to total costs (italic axis label).

The user can influence the search in two ways: by providing inputs which influence either the optimization model, or the optimization procedure (Figure

Steering features.

Type | Features | Purpose |
Model inputs | - Objective | Steer the search toward relevant areas of the solution space |
Procedure inputs | - Sampling method | Control quality, scope and duration of calculations |
Steering aids | - Authorized actions for model inputs (pointer shapes, axis style) | Guide the user toward feasible and meaningful actions |

The user specifies their preferences directly on the parallel coordinates chart which is used to display the solutions. This is done by brushing the axes to be optimized or constrained (Martin and Ward,

There are three associated steering actions performed in Step 1, which characterize the axes and the role they play in the optimization procedure in Steps 3 and 4 (Figure

The first action consists in defining the main

The second action consists in marking one or several axes as

The third action allows the user to systematically vary the value of a parametrized constraint within the boundaries of a brushed

Finally, for any criterion marked as either of the above actions, its preferred direction can be specified. For example, a cost criterion's preference will be “less,” indicating that less of that criterion is preferred to more. A benefit criterion will be “more,” as more is preferred to less. Practically, “less” results in

The first type of input regarding the optimization procedure is the stopping criteria for the solver, i.e., a solving time limit, and an optimality gap limit. The optimality gap is a useful feature specific to deterministic global optimization, which makes it possible to produce solutions “that differ from the optimum by no more than a prescribed amount” (Lawler and Wood,

A second input is the sampling method to be employed within the specified ranges, and an associated number of solutions to be sampled (see section 2.5). A third input consists in the desired scope or boundary of the problem. For example, in the case of urban planning, the perimeter to be considered in the problem can be increased or reduced.

While in principle these inputs could also be made directly via the parallel coordinates, they are specified here with buttons, forms and drop-down menus. It should be noted that, except for the problem boundary, these inputs are typically predefined by the analyst and do not require particular understanding from the user. They are rather intended for more experienced users and modelers.

Given the different types of content that can be displayed on the axes of the parallel coordinates chart, their typology and permitted steering actions must be clearly and intuitively conveyed to the user. Colored brushes, different axis styles and textual tooltips are used for this purpose. The axes can display two main types of information:

Methodology-specific information is displayed on axes with a dashed line style (Figure

Context-specific information generated by the optimization model is displayed on axes with a continuous line style (Figure

Another feature to assist the user in steering consists in highlighting the axes which played an active role in the optimization problem (Figure

A relational database is used to store both the data provided by the user in the interface (e.g., project details, raw steering preferences), and the data produced by the solution generator engine (e.g., problem formulations, solution results and related metadata). The data model for interactive optimization which was developed for the present methodology is described by Schüler et al. (

Once the user has specified the desired criteria to optimize (i.e., using objective and range brushes in Step 1), the goal is to solve the following generic multiobjective optimization problem, assuming without loss of generality all minimizing objectives (Collette and Siarry,

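where the vector $\mathbf{f}(\mathbf{x}) \in \mathbb{R}^{k}$ contains the $k$ objective functions to minimize, $\mathbf{g}(\mathbf{x}) \in \mathbb{R}^{q}$ are the inequality constraints, $\mathbf{h}(\mathbf{x}) \in \mathbb{R}^{r}$ are the equality constraints, and $\mathbf{x} \in \mathbb{R}^{d}$ are the $d$ decision variables, whose values are to be determined by the optimization procedure.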

In principle, this problem could be solved with either a deterministic or a heuristic method (cf. section 1.1). However, in order to benefit from widely available and efficient optimization algorithms such as the simplex or branch-and-bound algorithms (Lawler and Wood,

Scalarization functions have three key requirements in the context of interactive methods (Branke et al.,

where _{n, i} leading to Pareto optimal solutions. A first limitation of the WSM is that if the Pareto front is non-convex, the scalar function is not capable of generating solutions in that area (Branke et al.,

In the ϵ-constraint method, introduced by Haimes et al. (

where _{n, j} are parameters representing the upper bounds for the auxiliary objectives _{j} unique upper bounds for each objective are determined within a range of interest _{n, j}, i.e., for a total of

Schematic comparison of _{l}. Blue: arbitrary range of interest in the auxiliary objective _{j}, in which the upper bounds ϵ_{r, j} are automatically allocated by the sampling method (note: the ticks indicate the relative position of the constraints for a normalized range, and the subscripts

Conceptually, the ϵ-constraint method can be understood as the specification of a virtual grid in the objective space, and solving the single-objective optimization problem for each of the
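The contrast between the two scalarization approaches can be illustrated on a toy problem. The following is a minimal sketch, not the solver used in SAGESSE: the candidate set, the function names and the weight and bound values are all hypothetical, and a brute-force search over a handful of discrete candidates stands in for a real single-objective solver. It shows that a weighted sum misses a Pareto optimal point lying in a non-convex part of the front, whereas bounding the auxiliary objective recovers it.

```python
# Hypothetical candidate solutions as (f1, f2) pairs, both to be minimized.
# (5, 6) is Pareto optimal but lies in a non-convex part of the front;
# (8, 8) is dominated by (5, 6).
CANDIDATES = [(1, 9), (5, 6), (9, 1), (8, 8)]


def weighted_sum(candidates, weights):
    """Minimize w*f1 + (1-w)*f2 for each weight w; collect distinct optima."""
    found = []
    for w in weights:
        best = min(candidates, key=lambda c: w * c[0] + (1 - w) * c[1])
        if best not in found:
            found.append(best)
    return found


def epsilon_constraint(candidates, eps_values):
    """Minimize f1 subject to f2 <= eps for each upper bound eps."""
    found = []
    for eps in eps_values:
        feasible = [c for c in candidates if c[1] <= eps]
        if not feasible:
            continue  # the bound is too tight: no solution for this grid point
        # Tuples compare lexicographically, so ties on f1 are broken by f2.
        best = min(feasible)
        if best not in found:
            found.append(best)
    return found


wsm_solutions = weighted_sum(CANDIDATES, [i / 10 for i in range(11)])
eps_solutions = epsilon_constraint(CANDIDATES, [9, 6, 1])
print("weighted sum:      ", sorted(wsm_solutions))
print("epsilon-constraint:", eps_solutions)
```

In this toy setting each bound plays the role of one grid line in the objective space; in a real model the brute-force `min` would be replaced by a call to a mathematical programming solver.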

In addition to that, the original ϵ-constraint method in Equation (3) can be reformulated as a multiparametric optimization problem (Pistikopoulos et al., _{n, j} of the auxiliary objectives are varied, but also any other model parameter θ_{t} in the vector ^{m}. Thus, assuming without loss of generality all minimizing functions, the

where _{j} or to a model parameter θ_{t}) are referred to as ϵ_{n, p}, where

Referring to the steering actions performed by the user and defined in section 2.3.1, the brushed _{l} in Equation (4), while the lower and upper bounds of brushed _{j}(_{d}, the user can also control and vary individual decision variables directly. Finally,

Despite the advantages of the ϵ-constraint method, Chankong and Haimes (_{n, p} bounds in the incremental fashion described above. As such, and especially when many dimensions are involved, the generation of solutions using the ϵ-constraint method can be time-consuming and uneven across the objective space when interrupted prematurely, leading to a poor representation of the Pareto front (Collette and Siarry, _{n, p} in Equation (4) is discussed next.

Several studies have investigated ways to improve the determination of parameters in the ϵ-constraint method. For example, Chircop and Zammit-Mangion (

In SAGESSE, the quasi-random Sobol sampling method (Sobol,

With the Sobol sampling approach, the user specifies a number of solutions

where _{n, p} is an element in the matrix _{N×P}, whose rows contain the Sobol sequence of _{N×P}. Here, a Python implementation based on Bratley and Fox (^{sob}, Equation (7). This choice of range further implies that in this example, the coordinates of the parameters are in fact identical to those of the Sobol sequence in a unit hypercube:
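As an illustration of how Sobol points in the unit hypercube can be turned into parameter values, the sketch below generates the first points of a two-dimensional Sobol sequence and rescales them linearly to user-defined ranges. It is a hand-written, pure-Python sketch restricted to two dimensions, not the implementation referred to in the text; the function names are hypothetical, and the linear rescaling `lower + s * (upper - lower)` is an assumption, chosen because it leaves the coordinates unchanged when the ranges are [0, 1], as stated above.

```python
def sobol_2d(n_points, bits=30):
    """First n_points of the 2D Sobol sequence (Gray-code order, from the origin)."""
    # Dimension 1: direction integers of the base-2 van der Corput sequence.
    v1 = [1 << (bits - k) for k in range(1, bits + 1)]
    # Dimension 2: primitive polynomial x + 1 with m_1 = 1, which gives
    # m_k = 2*m_{k-1} XOR m_{k-1}  ->  1, 3, 5, 15, 17, ...
    m = [1]
    for _ in range(1, bits):
        m.append((2 * m[-1]) ^ m[-1])
    v2 = [m[k] << (bits - (k + 1)) for k in range(bits)]

    scale = float(1 << bits)
    x1 = x2 = 0
    points = [(0.0, 0.0)]
    for i in range(1, n_points):
        # Position of the lowest zero bit of i - 1 selects the direction number.
        c, value = 0, i - 1
        while value & 1:
            value >>= 1
            c += 1
        x1 ^= v1[c]
        x2 ^= v2[c]
        points.append((x1 / scale, x2 / scale))
    return points[:n_points]


def scale_to_ranges(unit_points, ranges):
    """Map unit-hypercube coordinates to [lower, upper] ranges (assumed linear)."""
    return [
        tuple(low + s * (high - low) for s, (low, high) in zip(point, ranges))
        for point in unit_points
    ]


unit = sobol_2d(4)
params = scale_to_ranges(unit, [(0.0, 10.0), (100.0, 200.0)])
print(unit)
print(params)
```

Because each new point fills the largest remaining gap, interrupting the sequence early still leaves the sampled ranges covered fairly evenly, which is the property exploited for steering.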

Alternatively, a standard systematic sampling method can also be used (Gilbert, _{p} is the number of requested points in the range of interest

Each dimension thus contains $N_p$ unique values to sample, computed as:

where _{n′, p}. The corresponding matrix ^{sys} resulting from systematic sampling is then populated by combining all parameter values in the following order:

As an example, for three dimensions sampled with systematic sampling between [0, 1] and for $N_1 = 3$, $N_2 = 2$, $N_3 = 2$, the resulting matrix of varied parameters is:
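The three-dimensional example with 3, 2 and 2 requested points can be enumerated with a short sketch. Two aspects are assumptions rather than statements from the text: the values are taken equally spaced with both range bounds included, and the rows are ordered with the last dimension varying fastest. The function name is hypothetical.

```python
from itertools import product


def systematic_matrix(ranges, counts):
    """Full-factorial combination of equally spaced values in each range."""
    axes = []
    for (low, high), n in zip(ranges, counts):
        if n == 1:
            axes.append([low])  # a single requested point: use the lower bound
        else:
            step = (high - low) / (n - 1)
            axes.append([low + i * step for i in range(n)])
    # product() varies the last dimension fastest (assumed row ordering).
    return [list(row) for row in product(*axes)]


matrix = systematic_matrix([(0.0, 1.0)] * 3, [3, 2, 2])
for row in matrix:
    print(row)
```

The number of rows is the product of the requested points per dimension (here 3 × 2 × 2 = 12), which illustrates why systematic sampling becomes expensive as dimensions are added.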

In this step, the single-objective optimization problems formulated based on Equation (4) are solved. In particular, the solver receives from the client the main objective to optimize, as well as the values for all specified parameters contained in _{N×P}. As long as the user has not specified new objectives on the parallel coordinates chart, the generation process continues to add solutions in the current ranges, taking as inputs the rows of _{N×P} one after another. As soon as a change in objectives occurs, the solver interrupts the current sampling sequence and starts again with the newly provided objective and _{N×P}.
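The generation logic described above can be sketched as a simple loop. This is a schematic illustration under stated assumptions, not the actual client-solver implementation: `solve` and `poll_user` are hypothetical callables standing in for the solver and for the interface, and the parameter matrix is represented as a plain list of rows.

```python
def run_generation(objective, matrix, solve, poll_user):
    """Solve one problem per matrix row; restart when the user changes objectives.

    poll_user() returns None if nothing changed, or a new (objective, matrix)
    pair when the user has brushed new objectives or ranges.
    """
    solutions = []
    row = 0
    while row < len(matrix):
        update = poll_user()
        if update is not None:
            # Interrupt the current sampling sequence and start again.
            objective, matrix = update
            row = 0
            continue
        solutions.append(solve(objective, matrix[row]))
        row += 1
    return solutions


# Hypothetical usage: the user changes the objective after two solutions.
events = iter([None, None, ("total costs", [[1], [2]]), None, None])
result = run_generation(
    objective="built density",
    matrix=[[10], [20], [30]],
    solve=lambda obj, params: (obj, tuple(params)),
    poll_user=lambda: next(events, None),
)
print(result)
```

Solutions found before the interruption are kept, so the chart accumulates results across successive steering actions.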

The purpose of exploration is for the user to learn about tradeoffs and synergies between the solutions, and develop their confidence in what qualifies as a good solution. The interface should offer a positive and intuitive experience, respecting the information-seeking mantra “overview, filter, details on demand” (Shneiderman,

Adopted exploration features related to the parallel coordinates interface, classified by type (O: overview, F: filter, D: details on demand).

Feature | Options | Type | Purpose | Reference |
Polyline color | Single color, Linear gradient, Z-score, Categorical colors | O | Identify patterns and clusters across axes | Shneiderman, |
Polyline width | Customizable scale | O | Identify patterns and clusters across axes | - |
Polyline curve | Customizable curve intensity | O | Identify patterns and clusters across axes, Avoid ambiguities | Franken, |
Axis choice and ordering | Drag-and-drop, Drop-down menu | O, F | Identify patterns and clusters across axes, Avoid redundancy | Jaszkiewicz and Słowiński, |
Clustering | k-medoids | O, F | Focus attention on few distinct and representative solutions, Group similar solutions together | Kaufman and Rousseeuw, |
Brushing | 1D, 2D, angular, etc. | F | Avoid cluttering, Provide additional information related to brushed polylines | Shneiderman, |
Hovering | - | F, D | Avoid cluttering, Provide additional information related to hovered polyline | - |
Multiattribute decision analysis | TOPSIS | F, D | Provide aggregated score and ranking to facilitate interpretation of MODA results | Hwang and Yoon, |
Linked views | 3D scatter plots, 2D scatter plot matrices, Maps | D | Overcome and complement visual limitations of parallel coordinates, Avoid visual overload | Buja et al., |

The parallel coordinates reveal tradeoffs (or negative correlations) between two axes as crossing lines, and synergies (or positive correlations) as non-crossing lines (Inselberg,
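The link between crossing lines and negative correlation can be made concrete with a small sketch. It assumes that both axes are drawn with the same orientation and a linear scale, in which case two polylines cross between adjacent axes exactly when their order is reversed from one axis to the other. The function name and the sample values are hypothetical.

```python
from itertools import combinations


def crossing_fraction(axis_a, axis_b):
    """Share of polyline pairs that cross between two adjacent axes."""
    pairs = list(combinations(range(len(axis_a)), 2))
    if not pairs:
        return 0.0
    crossings = sum(
        1
        for i, j in pairs
        # Opposite ordering on the two axes means the two segments intersect.
        if (axis_a[i] - axis_a[j]) * (axis_b[i] - axis_b[j]) < 0
    )
    return crossings / len(pairs)


tradeoff = crossing_fraction([1, 2, 3], [30, 20, 10])  # every pair crosses
synergy = crossing_fraction([1, 2, 3], [10, 20, 30])   # no pair crosses
print(tradeoff, synergy)
```

A fraction close to 1 corresponds to a strong tradeoff between the two criteria, and a fraction close to 0 to a synergy.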

The user can filter the polylines to display only those of interest by “brushing” the desired axes (Heinrich and Weiskopf,

Another way to filter the displayed information concerns the visible axes representing the criteria. While parallel coordinates scale well to large numbers of criteria (Inselberg,

The use of clustering techniques is a common approach to help make the selection of a solution from a large Pareto optimal set more manageable (Aguirre and Taboada,
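To illustrate how representative solutions can be extracted, the following is a minimal k-medoids sketch based on alternating assignment and medoid update steps. It is not the clustering routine used in the interface: the Euclidean distance, the deterministic initialization with the first k points and the function name are assumptions made for the example, and the criteria are assumed to be already normalized.

```python
import math


def k_medoids(points, k, max_iter=100):
    """Return (sorted medoid indices, medoid index assigned to each point)."""
    medoids = list(range(k))  # assumed initialization: the first k points
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: math.dist(p, points[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid is the member with the smallest
        # total distance to the other members of its cluster.
        new_medoids = []
        for m, members in clusters.items():
            if not members:
                new_medoids.append(m)  # keep the medoid of an empty cluster
                continue
            new_medoids.append(
                min(
                    members,
                    key=lambda c: sum(
                        math.dist(points[c], points[o]) for o in members
                    ),
                )
            )
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    labels = [
        min(medoids, key=lambda m: math.dist(p, points[m])) for p in points
    ]
    return sorted(medoids), labels


solutions = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels = k_medoids(solutions, k=2)
print(medoids, labels)
```

Unlike a centroid, each medoid is an actual solution of the set, so it can be displayed and inspected like any other polyline.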

When many solutions are compared across many dimensions, it can become overwhelming to distinguish which stand out overall. Psychological studies have emphasized the limited ability of human decision makers in balancing multiple conflicting criteria, even between a limited number of alternatives (French,

As pointed out in the introduction (Hwang and Masud,

To avoid burdening the user with further methodological aspects, the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) method is adopted for its intuitive and understandable principle, limited need for inputs and ability to handle many criteria and solutions (Zanakis et al.,

Two methodological aspects in particular must be considered in the TOPSIS method, namely the normalization of data, and the choice of ideal solutions.

First, to account for different scales in the criteria, values must be normalized for comparability (Kaufman and Rousseeuw,

Regarding the choice of ideal solutions, while the original TOPSIS method computes them
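For readers unfamiliar with TOPSIS, the classical procedure can be sketched in a few lines. This sketch follows the textbook variant and is not necessarily the configuration adopted in SAGESSE: vector normalization, equal weights by default and ideal solutions derived from the current set of solutions are all assumptions, as are the function name and the sample data. The preference directions reuse the “less” and “more” labels introduced for the steering actions.

```python
import math


def topsis(rows, directions, weights=None):
    """Relative closeness to the ideal solution for each row (1.0 is best)."""
    n_crit = len(directions)
    weights = weights or [1.0 / n_crit] * n_crit
    # Vector normalization per criterion (assumed), then weighting.
    norms = [
        math.sqrt(sum(r[j] ** 2 for r in rows)) or 1.0 for j in range(n_crit)
    ]
    v = [[weights[j] * r[j] / norms[j] for j in range(n_crit)] for r in rows]
    # Ideal and anti-ideal points taken from the current solutions (assumed).
    ideal, anti = [], []
    for j, direction in enumerate(directions):
        column = [row[j] for row in v]
        if direction == "more":
            ideal.append(max(column))
            anti.append(min(column))
        else:  # "less"
            ideal.append(min(column))
            anti.append(max(column))
    scores = []
    for row in v:
        d_plus = math.dist(row, ideal)
        d_minus = math.dist(row, anti)
        total = d_plus + d_minus
        scores.append(d_minus / total if total else 0.0)
    return scores


# Hypothetical solutions described by (total costs, RES share).
scores = topsis(
    [[100, 0.9], [200, 0.5], [150, 0.7]],
    directions=["less", "more"],
)
print(scores)
```

The resulting score can be mapped to a color gradient or to an additional axis, giving a single aggregated indication of how each solution performs overall.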

In some cases, the content and format of parallel coordinates are not sufficient or suited to convey certain types of information. Xiao et al. (

To address these gaps, two features are implemented. The first makes it possible to visualize all (or subsets of) solutions in interactive 2D scatter plot matrices and 3D scatter plots (Plotly-Technologies-Inc.,

In the urban planning case-study presented in section 3, clicking a polyline triggers the generation of geographic maps, which complement the parallel coordinates chart with spatial and morphological information, providing also a more detailed insight into the decision variables of a solution (location, size, and type of buildings, energy technologies, etc.).

The ability to effectively convey key analytical information from the methodology to complement the decision maker's intuitive and emotional thought process is essential to influence the decision process. Studies performed by Trutnevyte et al. (

A “comparer dashboard” is thus developed to address this need, synthesizing with key information the main insights gained during the search process (Figure

The SAGESSE methodology is targeted at supporting decisions in large problems with many decision variables, and for which preferences are difficult to specify

It is unlikely that a single stakeholder possesses a clear vision of the needs of all actors, and even less so a precise and quantifiable understanding of the tradeoffs and synergies among them.

Political targets and trends evolve more rapidly than the realization of urban projects.

There is a gap between the strategic planning phase, and the more concrete design phase, making it difficult to anticipate consequences of early decisions (deVries et al.,

Supporting decisions in the early stages of an urban redevelopment project can therefore particularly benefit from interactive optimization, given the large solution space (choice of type, size and location of buildings and energy equipment), and elusive preferences and values involved (Cohon,

A case is demonstrated hereafter for searching a preferred alternative in the redevelopment of an urban neighborhood, illustrating the various exploratory and steering features described above.

The case-study presented here consists of a redevelopment project for a Swiss neighborhood of roughly 300 buildings. The main conflicting objectives, which the planners must achieve, include: (i) increasing the residential built density, (ii) reducing the overall greenhouse gas emissions, and (iii) promoting quality of life. The analysis phase leading to the optimization model consisted in multiple workshops with the urban and energy planning team, as well as a review of available master plans and legal documents. During this phase, the perimeter of the project was determined, as well as the main issues, objectives and constraints faced by the local experts. In the search phase of SAGESSE, the user does not repeat this entire process, but nevertheless begins by selecting the project boundaries, and key criteria they wish to explore. Cajot et al. (^{io} introduced in Cajot et al. (

After creating a new project, the user faces an empty chart. This blank chart forces them to think of the most valued aspects in the project, in other words, what they are trying to achieve. For example, they might start exploring the tradeoffs between three important and often conflicting criteria in urban planning: the floor area ratio (FAR) or built density (to be maximized), the renewable energy sources (RES) share (also to be maximized) and the total costs associated with the corresponding decisions (to be minimized). If they know

Parallel coordinates charts showing different steering inputs (left) and resulting solutions computed with Sobol sampling (right).

A first question here is:

As the solutions requested in Figure

Continuing this process, the user can answer further questions, such as:

By including the decentralized oil boiler axis via the drop-down menu (Figure

The reduction in the number of oil boilers has little effect on the total costs and on the RES share compared to the solutions which included them. To explain this lack of effect, the user could further explore different criteria concerning the oil boilers and other technologies to find out their respective contributions to the neighborhood's energy supply. In this case however, a cartographic representation of the annual energy supply per building and per energy technology is better suited to provide an overview of all technologies. The maps of solutions containing the oil boilers (not shown) are in fact similar to those without oil boilers (e.g., Figure

Repeating this process for wood boilers, which are also to be avoided in urban centers because of health-related issues, the user adds a new constraint on the wood boiler axis, and requests five new solutions (Figure

At this point, the user could continue by investigating, e.g., social or economic questions, such as the distribution of costs between building owners and energy provider, the impact of increased density on the view of aesthetic landmarks, etc. As the solution generation process evolves, however, the number of solutions and criteria grows rapidly. This is where MADA and cluster analysis can further support the exploration. To develop a general understanding of which solutions perform best, the TOPSIS method is applied on-the-fly to the current solutions, and colored accordingly (Figure

Another way to cope with the many solutions is to perform cluster analysis to identify the few most representative solutions. In Figure

Depiction of the k-medoids cluster analysis results in 3D scatter plots for increasing number of clusters k. Colors indicate similar solutions belonging to a same cluster, and black diamonds indicate the representative solution (medoid) in each cluster.

After generating and exploring several alternatives, the user can narrow down the number of solutions to only a subselection of the most promising ones and add them to the comparer dashboard (Figure

Comparer dashboard containing three representative solutions from a cluster analysis on the chosen criteria. The thumbnails depict the buildings in each solution, colored by share of energy performance certificates (“Share perf. cert.”) adopted. Green fonts indicate the best performing values, red the worst.

By personally going through the systematic search process, the user has gained a better understanding of the problem and of their own preferences. New questions were raised along the way, which could be answered on-the-fly. The main learning points from this demonstration can be summarized as: knowledge of the maximum achievable density, required costs to achieve a highly renewable energy neighborhood, and the corresponding density threshold, as well as the maximum RES share achievable in the absence of wood boilers. Overall, this knowledge of extreme cases, but also the finer understanding of tradeoffs and tipping points between conflicting objectives gained during the search phase, gives the user more confidence in justifying the chosen solutions, or the reasons why others were discarded. In addition, by laying out side by side the main criteria for a subselection of solutions in the synthesis phase, the user is equipped to make an informed decision, and to justify and communicate it to other stakeholders.

A novel interactive optimization methodology was presented, which enables users to simultaneously generate and explore solution spaces of large problems in real-time. It aimed at addressing the four main gaps in current interactive optimization methods, namely the ability to handle many objectives and alternatives, to explore the latter in an efficient way, to communicate the results effectively, and to remain simple and accessible overall to users unfamiliar with optimization. Furthermore, it was demonstrated and applied in an urban system design problem. The contributions are summarized hereafter.

smart ordering of axes, e.g., by exhaustive pairwise depiction (Heinrich and Weiskopf,

improved visualization of polylines and patterns, e.g., visually bundling clusters into compact polygons (Palmas et al.,

handling of time series, e.g., using the third dimension to display temporal evolutions (Gruendl et al.,

dynamic rescaling of axes, to allow brushing beyond the visible values (currently brushing beyond the visible axis requires manually editing the numerical brush bound in an

Whereas traditional interactive methods tend to clearly distinguish the learning phase, the preference articulation phase, and the generation of solutions, the proposed methodology blends these three phases into an integrated, more immersive experience. Indeed, the exploration of solutions need not be interrupted while the optimization is running: as soon as a solution is found, it is directly included into the chart (and into the user's mind) for the user to interpret. Through various exploratory features, the user becomes mindful of the relationships between criteria, and of how much they are willing to sacrifice in one in order to gain value in others. Each new solution clarifies the critical contradictions to be resolved, allowing the user to refine the search toward the areas found most relevant.

Because both the phases concerning the human and those concerning the computer occur at the same time, the whole can be considered an “optimization-based thought process.” The computer optimization becomes an extension of the user's mind, while at the same time, the user's mind becomes an extension of the optimization. Vinge (

Finally, the acronym of SAGESSE (French for

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.