ASSESSMENT OF DUCTILE IRON CASTING PROCESS WITH THE USE OF THE DRSA METHOD

The paper introduces a concept for assessing a ductile iron casting process using a rule-based approach known as DRSA (dominance-based rough set approach). The research was conducted in a large Polish foundry. The collected data concern the chemical composition and mechanical properties of the ductile cast iron used. A methodology for creating a rule-based model of tensile strength is proposed. The quality, sensitivity and accuracy of the model extracted from the data were examined. The studies demonstrated its usefulness in industrial practice and in supporting the decision-making process.


Introduction
An important factor determining the competitiveness of a manufacturing enterprise is its ability to deliver products of the specified quality on time and at an agreed price. This ability results from effective management of the production process, whose main element is the manufacturing process. The manufacturing process creates the added value and is directly related to the change of shape, size, surface quality, physicochemical properties or appearance of the processed material, or the change of the mutual position of the product parts [1]. In manufacturing process control, and more specifically in the control of a given technological operation (in quality engineering a process is often understood as even a single operation, in which resources are supplied at the beginning and the output is a finished product or a processed material), it is of particular importance to be able to conduct ongoing (on-line) assessment of the process. The process state can be described by a set of measures, the characteristics of the process. Knowing them makes it possible to predict the quality of the products (semi-finished goods) that are the output of the production process, and to control the process effectively by taking corrective actions.

Quality control of foundry process on the example of ductile iron
In each manufacturing process it is important to find the relation between the process parameters and the properties and quantities of the materials used on the one hand, and the properties of the products on the other. The model of these relations is often called a quality model [2,3].
In the case of foundry processes its construction is particularly difficult, which is associated with the degree of process complexity (Fig. 1).
A vital problem in foundry process control lies in the acquisition of data.
Foundries are widely regarded as undertakings in which it is difficult to maintain the traditional procedures for manufacturing, quality, timeliness and continuity of production. This is because the production and quality control of castings is difficult: each casting has its own gating system, moulding sand and core sand, and castings are made in different conditions, in many departments and by many employees [2].
A typical casting production process includes approx. 100 parameters which may affect its course. It is often very difficult or even impossible to find the relations between them, particularly when the parameters are obtained at different stages of the process or when they pertain to new processes. Hence, casting processes are considered non-algorithm based, that is, processes that cannot be related to any simple mathematical model [2]. This leads to the modeling of casting processes with a limited set of parameters, the values of which can be properly attributed to a particular casting or batch of castings.
For ductile iron, it is considered that the melting process has the primary influence on its structure and properties. In general, these depend on the chemical composition of the melt, the physical properties of the molten metal and the cooling rate of the casting. Among these factors, the decisive one is the content of 9 elements in the melt: carbon, silicon, manganese, phosphorus, sulphur, chromium, nickel, copper, and magnesium (C, Si, Mn, P, S, Cr, Ni, Cu, Mg), which are usually monitored as part of melting process control [5].
Quality control in ductile cast iron is most often based on the measurements of tensile strength and elongation. Sometimes, however, additional properties are specified, such as hardness [5].
Quality assessment of ductile iron casting process (assessment of process state), which takes place on the basis of a number of state measures (including, for example, solely the content of elements in the melt), is a multicriteria classification problem.

Methods to evaluate the process state
The decisions concerning the quality of the casting process are usually made on the basis of data provided by control: one hundred percent control, acceptance sampling or statistical process control [6]. The evaluation is usually performed "post factum": when the operations end, one or more critical attributes are measured, and the process is evaluated on the basis of the measurement. The industrial measurement systems should also be analyzed in order to confirm their adequacy for the measuring tasks and the reliability of the data obtained from the measuring process [7].
In the era of aiming for zero-defects production (or zero-defects manufacturing), one cannot be satisfied with evaluation of the manufacturing process state based on one (or even several) critical attribute of the product. It is particularly important to take into account such process attributes as: process parameters, diagnostic signals accompanying the process and events that occurred during the process. A collection of such data is the starting point for developing a process model, which can be used to predict its future states.
Assessing the state of a process based on attribute datasets poses a problem of classification. Classification methods have been developed by researchers in many areas. Classical methods of process quality assessment include approaches within Statistical Process Control, i.e. capability indices, control charts and approaches to process evaluation based on the number of defects in the process output [9,10]. The second group of methods used to assess the process state includes the broadly understood Data Mining methods that allow for the construction of the process model (classifier) [11]. According to one of the approaches, to create a classifier of the process state a historical data set must be acquired (Fig. 2). The data describing a specific implementation of the manufacturing process include: values of parameters, diagnostic signals, and events in the process, as well as the quality assessment of the process output. The evaluation is performed by a domain expert (here: production engineer or process operator), making it possible to incorporate his (implicit, and therefore difficult to extract) knowledge into the procedure of process model development.

Figure 1. Quality model of manufacturing processes in a typical foundry [4]
There is a whole range of methods used to build a classifier based on historical data. These include: neural networks, multiple regression, classification trees, and many others. Having analyzed the literature on the evaluation of the manufacturing process state, the authors chose a method that allows for the generation of rules, based on the dominance-based rough set approach (DRSA). The choice is a result of research conducted by the authors of the paper in 2010-2014 [12,13].

Dominance-based rough set approach (DRSA)
The choice of the DRSA method to create a classifier is directly linked with the features of the ductile iron casting technological process. The state evaluation of such a process may take discrete values, expressed both by figures and symbolically, from a pre-defined set, for example {1, 2, 3} or {good, bad}. As a result, the problem of multicriteria decision-making in automated process control comes down to the problem of assigning states and results of the process to predefined classes. In such a case it is necessary to incorporate the expert's knowledge into the procedure and to obtain a model of his preferred process states [14]. What is more, if the decisions on assigning a process state to a given class are made based on rules explicitly describing the dependency of the states on the symptom values of the state, it is recommended to diagnose the reasons for the condition.
The DRSA, a multicriteria decision-making method based on the modeling of relations between the process states and the values of state symptoms in the form of rules induced from the process data, meets all the above conditions. DRSA is an extension of the rough set theory [15,16,17].
In this method, just like in artificial neural networks [18], the knowledge is induced from data which are examples of the process, and the data may include gaps and inconsistencies. However, in contrast to neural networks, the knowledge takes the form of user-readable rules. Compared to fuzzy logic systems, the advantage of DRSA lies in the lack of need to discretize the variable domains of the system and to introduce additional assumptions, e.g. on the distribution of data fuzziness.
In the classical rough set theory, the indiscernibility relation is used to compare objects described by certain attributes. The relation is the basis for establishing a rough set that represents the concept of a decision class by differentiating its lower and upper approximations. Objects which unambiguously belong to the given decision class are contained in the lower approximation, and objects whose affiliation to the class cannot be excluded belong to the upper approximation of the set. This provides for an analysis of inconsistent data. The main difference between the classical rough set theory and DRSA involves the replacement of the indiscernibility relation by the relation of dominance.
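The lower and upper approximations of the classical theory can be illustrated with a minimal Python sketch. The function name `approximations`, the toy melts and their discretized attribute values are hypothetical and serve only to show the mechanism; they are not taken from the foundry data.

```python
from collections import defaultdict

def approximations(objects, attributes, target):
    """Return (lower, upper) approximations of `target`, a set of object ids.

    `objects` maps an object id to a dict of attribute values. Two objects are
    indiscernible when they agree on every attribute in `attributes`.
    """
    # Group objects into indiscernibility classes (identical descriptions).
    blocks = defaultdict(set)
    for oid, desc in objects.items():
        blocks[tuple(desc[a] for a in attributes)].add(oid)

    lower, upper = set(), set()
    for block in blocks.values():
        if block <= target:   # every member belongs to the class: certain
            lower |= block
        if block & target:    # some member belongs to the class: possible
            upper |= block
    return lower, upper

# Hypothetical toy data with two discretized attributes.
melts = {
    "m1": {"Cu": "high", "Mn": "low"},
    "m2": {"Cu": "high", "Mn": "low"},
    "m3": {"Cu": "low", "Mn": "low"},
    "m4": {"Cu": "low", "Mn": "high"},
}
good = {"m1", "m3"}  # melts labelled "good" by a hypothetical expert
lower, upper = approximations(melts, ["Cu", "Mn"], good)
```

Here m1 and m2 have identical descriptions but different labels, so they fall outside the lower approximation and inside the upper one; this is the kind of inconsistency the approximations are meant to expose.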
From the point of view of applying DRSA in classification tasks, its most important abilities can be found in exploring domain information in the data taking into account the preference order in the domain of attributes and semantic correlation between the attributes, i.e. compliance with the dominance principle.
DRSA, like the classical theory, classifies the set into a certain number of disjoint decision classes, but the decision classes are organized in such a way that the higher the class number, the better the class, that is, in order of preference. As a result, the idea of a single class is replaced by unions of decision classes. A set of cases unambiguously belonging to the union of classes constitutes its lower approximation, and a set of cases which may belong to the union constitutes its upper approximation. The classical rough set theory based on the indiscernibility relation generates decision rules which use solely the "=" relation. The rules generated by DRSA have a richer syntax, because they apply the relations "≥", "≤" and "=". Thus the representation of knowledge in rules generated by DRSA is more synthetic. Moreover, contrary to the classical theory, DRSA does not require the discretization of quantitative attributes.

A. Kujawińska et al. / JMM 52 (1) B (2016) 25-34

Figure 2. The concept of supervised model building (with learning phase) [8]

The first step in applying DRSA in process control is the preparation of data for this process in the form of the so-called decision table (Fig. 3). The columns of the table include attribute values (measures describing the casting process) for its subsequent implementations (cases). The attributes are divided into conditional attributes (the process measures) and decision attributes (criteria for the process evaluation). The table rows represent subsequent cases of casting.
The second step consists in selecting the criteria from the conditional attributes, determining the direction of preference for them, determining the direction of preference for the decision criterion, and transforming the remaining conditional attributes to criteria by means of the attribute duplication technique (Fig. 3). Expert knowledge is necessary to properly prepare the data for analysis. Hence, at this stage of work on the model of the manufacturing process, the analyst (knowledge engineer) who prepares the data must closely cooperate with the expert in the field (e.g. process engineer or technologist).
In the third step the properties of the decision table are analyzed, including in particular the analysis of inconsistencies in the data and the resulting quality classification, which leads to the induction of decision rules (Fig. 3). The stage is carried out iteratively.
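As an illustration of how inconsistencies in a decision table affect the approximations of class unions, the sketch below computes the lower and upper approximation of an upward union ("class at least t") using the standard dominance-based definitions. The four cases, the attribute values and the preference directions are hypothetical, and the function names are illustrative only; the sketch does not reproduce the variable-consistency variant used later in the paper.

```python
def dominates(x, y, prefs):
    """True if case x is at least as good as case y on every criterion."""
    return all(
        x[a] >= y[a] if direction == "gain" else x[a] <= y[a]
        for a, direction in prefs.items()
    )

def upward_union_approximations(table, prefs, t):
    """Lower and upper approximation of the union 'class at least t'."""
    union = {i for i, row in table.items() if row["cls"] >= t}
    lower, upper = set(), set()
    for i, x in table.items():
        dominating = {j for j, y in table.items() if dominates(y, x, prefs)}
        dominated = {j for j, y in table.items() if dominates(x, y, prefs)}
        if dominating <= union:   # all cases at least as good are in the union
            lower.add(i)
        if dominated & union:     # some case no better than x is in the union
            upper.add(i)
    return lower, upper

# Hypothetical preference directions and cases (not the foundry data).
prefs = {"Cu": "gain", "Si": "cost"}
table = {
    "a": {"Cu": 0.8, "Si": 2.2, "cls": 3},
    "b": {"Cu": 0.6, "Si": 2.4, "cls": 2},
    "c": {"Cu": 0.7, "Si": 2.3, "cls": 1},  # dominates b but has a worse class
    "d": {"Cu": 0.2, "Si": 2.8, "cls": 1},
}
lower, upper = upward_union_approximations(table, prefs, t=2)
accuracy_of_approximation = len(lower) / len(upper)
```

Cases b and c violate the dominance principle with respect to each other, so both end up in the boundary of the union "at least 2", which lowers the accuracy of approximation.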

Experimental
The DRSA method was applied to evaluate the cast iron casting process. The dataset relating to the actual process was collected in a foundry in Poland. The data describe the chemical composition and mechanical properties of the castings.

Data acquisition
In the studied foundry the following charge material was used:
- low-alloyed foundry pig iron, 20-30 %,
- own cast iron scrap, added to the furnace as a supplement,
- foreign steel scrap, making up 10-20 % of the charge, usually of unexamined composition.
After melting the charge, the metal was held at a temperature of ~ 1400 °C. After complete melting the chemical composition of the melt was measured. The spectrometer connected to a PC enabled the registration of the values of individual elements. Then spheroidizer was added to the bottom of the ladle, and its amount depended on the intended alloy and the temperature of the molten alloy. For cast iron with increased hardness the additive was either pure copper or modifier. During melting the temperature was measured when the metal was transferred to the ladle for spheroidization, and in the casting ladle. Its accuracy was within ± 0.1 %. Measurement error in this case was related to the instrument used, an immersion thermocouple with a measuring range of 600-1800 °C [19].
The basic analysis of the chemical composition of the cast iron was registered, with nine alloying elements: carbon, silicon, manganese, phosphorus, sulphur, chromium, nickel, copper and magnesium. For the castings obtained from the melt a standard measurement of tensile strength was performed [19]. The studies resulted in over 900 data records. For some melts additional Al, Ti and Sn content was recorded, together with the spheroidization temperature [5]. After selecting the complete records (with the contents of the 9 basic components of the melt), a data set of 866 records was obtained. A fragment of the data set is shown in Table 1.
The data collected from the casts is characterized by the following values [21]: Tensile strength (Rm) was considered [22] the main aspect (criterion) for assessing the quality of the casting process. Tensile strength of the obtained cast iron changed in the range of 382-860 MPa. Such a wide range of values resulted from the class of ductile iron castings produced in the foundry: 400/18, 500/07 and 500/07 with increased hardness (obligation imposed by the customer). All tests were carried out on Y2 separately cast samples [21].

Distribution of Rm values into decision classes
Based on the measured Rm value, each record of the data set representing the measured cast iron was additionally assigned a cast iron grade. The assignment was based on the values given in the Polish standard (PN-EN 1563:2000) and on expert knowledge. Four basic classes (categories) were distinguished, corresponding to [23]:
- grade 400/18,
- grade 500/07,
- grade 500/07 with increased hardness,
- no grade (unclassified grade, properties beyond the standard specification).
The classes were assigned the designations 1, 2, 3 and 0, respectively.
Detailed terms of castings distribution into grades are shown in Table 2. The established conditions of casting assignment to different grades were the basis for the discretization of casting quality assessment criterion, i.e. the distribution of the continuous Rm value domain into classes. The class division assumed by the process expert was called "class division in the grade function". Table 3 shows the aspects (decision criteria) for assessing the quality of the casting process, with the distinguished decision classes and the principles of assigning the continuous value for a given class.
It was assumed that the assessment of the quality of casting process (state) is carried out either for tensile strength (Rm: 1, 2, 3) or from the point of view of the resulting cast iron grade (grade: 0, 1, 2, 3).

Preparation of data analyses
The data set on the casting process, collected in actual production conditions and then supplemented by data on grades and tensile strength classes, was forwarded to the authors in the form of a decision table. To obtain rule-based models of the iron casting process from the data, the data was prepared for analysis in accordance with the DRSA, which in general involved:
- determining data types and domains for conditional attributes,
- selecting the criteria from the conditional attributes and determining the direction of preference for them, as well as determining the direction of preference for the decision criterion (tensile strength),
- transforming the remaining conditional attributes (non-original criteria) to criteria by means of attribute duplication.
According to the domain knowledge it was assumed that all conditional attributes (i.e. 9 alloying elements) would be represented by positive real numbers (continuous data type). The values that the decision criterion may take correspond to the adopted decision classes. By design, they are always discrete.
The domain expert decided to determine the directions of preference for the nine alloying elements (conditional attributes) with respect to tensile strength (Table 4). In Table 4, GAIN means that the higher the value of the attribute the better, and COST that the lower the value the better. A theoretical and experimental analysis justifying this selection for the data set in question may be found in the literature [2].

It was also assumed that the decision maker wants to maximize the mechanical properties of the casting, so it was assumed that Rm is an attribute of the GAIN type.
In the next step of data preparation, the conditional attributes (non-original criteria) were transformed to criteria by means of attribute duplication techniques.
The decision table with preference information was saved in a format accepted by aMOPS software [20], in which the analysis was performed.
The names of conditional attributes used in the program files correspond to the symbols of the various alloying elements. If, however, an attribute was duplicated in the decision table, then in the files accepted by the software it took the name with the prefix "g" (from "gain"), while the corresponding attribute-clone took the name prefixed with "c" (from "cost"). That is why duplicated attributes in the records of the rules, coming directly from the aMOPS software, are prefixed with "c" and / or "g".
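The attribute duplication and naming convention can be sketched as follows. This is a minimal illustration, not the aMOPS file format: the function `expand_criteria`, the exact form of the prefixed names (e.g. "gP") and the preference directions in the example are assumptions made for the sake of the sketch.

```python
def expand_criteria(row, preferences):
    """Map raw attribute values to criteria with a preference direction.

    `preferences` maps an attribute name to "gain", "cost" or None (unknown).
    An attribute with an unknown direction is duplicated into a gain copy
    (prefix "g") and a cost copy (prefix "c").
    """
    values, directions = {}, {}
    for name, value in row.items():
        direction = preferences.get(name)
        if direction in ("gain", "cost"):
            values[name] = value
            directions[name] = direction
        else:
            values["g" + name] = value
            directions["g" + name] = "gain"
            values["c" + name] = value
            directions["c" + name] = "cost"
    return values, directions

# Hypothetical record and preference information (not taken from Table 4).
record = {"Cu": 0.5, "P": 0.04}
preferences = {"Cu": "gain", "P": None}
values, directions = expand_criteria(record, preferences)
```

Duplicating an attribute lets the rule induction use it with either "≥" or "≤" conditions, so no monotonic relation with the decision has to be assumed in advance.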
The next two steps in creating a rule-based classifier of ductile iron casting process are described in the next section.

The procedure for iterative stages of Data analysis and mining and Assessment of the set of rules ("model learning")
Data analysis and mining and Assessment of the (obtained) set of rules were carried out iteratively in order to obtain a model with satisfactory predictive properties, acceptable from the standpoint of a domain expert.
When assessing the predictive ability of the model, the (general) classification accuracy and the sensitivity and precision measures for the less numerous class(es) were taken into account. The first iteration was performed for the whole set of conditional attributes (i.e. "the input set of attributes").
As the efficacy of classifying new cases by the model using the "input set of attributes" was not satisfactory, in the second iteration the set of conditional attributes was limited to a subset of attributes selected by the domain expert. The subset was defined as the "expert's subset of attributes" (ESA).
The ESA subset consists of 5 alloying elements: manganese, silicon, chromium, nickel, and copper (Mn, Si, Cr, Ni and Cu), which have the greatest impact on the microstructure and strength of ductile cast iron among the nine registered elements.
If the predictive abilities of the model obtained in the second iteration were not satisfactory, and ESA reducts existed, then the third iteration (or a group of iterations) of data analysis and mining and verification of the resulting set of rules would be run.
As a result of the three groups of iterations a decision was made to accept the selected model as the final model. The set of attributes which was the basis for building the final model is at the same time the outcome of the implicit attribute selection process, integrated into the iterative procedure of building a model of cast iron casting process.
This paper presents only the proceedings and conclusions on the final model, without describing in detail the properties of the model developed on the basis of all alloying elements, which had inferior properties.

Data analysis and induction of decision rules
The first step ("calculation") of data analysis and mining was aimed at designating the characteristics of the decision table, such as: approximation of class unions, classification quality, and, if the consistency level adopted in the iteration was equal to 1, reducts and a core set of attributes. Next, following the analysis of results, decision was taken to either continue the analysis with the selected parameter settings or discontinue it (leaving the given iteration and returning to the beginning of the stage).
The next step, after determining the basic characteristics of the decision table, was the induction of rules from the data contained in the decision table. Rules were induced with the VC-DomLEM algorithm (a rule induction algorithm for variable consistency rough set approaches) implemented in the aMOPS software. The generated rules could be analyzed in detail based on assessment measures calculated for each of them, such as support, strength, coefficient of coverage and reliability.
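The rule assessment measures can be illustrated with a short sketch that uses their common rough-set definitions: support is the number of cases matching both the condition and the decision part, strength is support divided by the number of all cases, coverage is support divided by the number of cases in the decision class union, and confidence (here taken as the counterpart of "reliability") is support divided by the number of cases matching the condition part. Whether aMOPS uses exactly these definitions is an assumption, and the rule and data below are hypothetical.

```python
def rule_measures(table, condition, decision):
    """Support, strength, coverage and confidence of one decision rule."""
    matching = [row for row in table if condition(row)]
    in_class = [row for row in table if decision(row)]
    support = sum(1 for row in matching if decision(row))
    return {
        "support": support,
        "strength": support / len(table),
        "coverage": support / len(in_class) if in_class else 0.0,
        "confidence": support / len(matching) if matching else 0.0,
    }

# Hypothetical cases and rule: "if Cu >= 0.7 then class at least 3".
cases = [
    {"Cu": 0.80, "cls": 3},
    {"Cu": 0.70, "cls": 3},
    {"Cu": 0.75, "cls": 2},
    {"Cu": 0.30, "cls": 1},
    {"Cu": 0.20, "cls": 3},
]
measures = rule_measures(
    cases,
    condition=lambda row: row["Cu"] >= 0.7,
    decision=lambda row: row["cls"] >= 3,
)
```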

The induced rules are the outcome of data analysis and mining. Next, they are evaluated in the stage Assessment of the set of rules (verification of the resulting model).

Assessment of the set of rules (cross-validation)
At the stage of verification of the set of rules generated from the casting process data, the model assessment consisted of a formal evaluation of the effectiveness of classification of new cases by the set of rules, and the verification of the set of rules and its properties by a domain expert.
The effectiveness of classification by the obtained sets of rules was formally assessed by means of a cross-validation test. Due to the size of the data set (866 cases), 10-fold stratified cross-validation was used. In order to obtain more reliable results (less dependent on the random distribution of cases in the learning and test sets), the validation was repeated 5 times and the results were then averaged. In the test the (general) classification accuracy was determined, together with sensitivity and precision measures calculated (independently) for each class.
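The splitting scheme of a repeated, stratified 10-fold cross-validation can be sketched in plain Python as below. This is an illustrative sketch only, not the procedure implemented in aMOPS; the function names and the random seed are hypothetical. The class sizes of 211 and 215 cases are those quoted in the paper's error analysis, and 440 is derived here as 866 - 211 - 215.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k, rng):
    """Assign each case index to one of k folds, keeping class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the cases of one class round-robin over the folds.
        for position, index in enumerate(indices):
            folds[(offset + position) % k].append(index)
        offset += len(indices)
    return folds

def repeated_cv_splits(labels, k=10, repeats=5, seed=0):
    """Yield (train_indices, test_indices) for each fold of each repetition."""
    rng = random.Random(seed)
    for _ in range(repeats):
        for test in stratified_folds(labels, k, rng):
            held_out = set(test)
            train = [i for i in range(len(labels)) if i not in held_out]
            yield train, test

# Class labels only; sizes as explained above (440 is a derived value).
labels = [1] * 440 + [2] * 211 + [3] * 215
splits = list(repeated_cv_splits(labels))
```

Each of the 50 resulting splits would be used to induce rules on the training part and classify the test part, after which the accuracy, sensitivity and precision values are averaged.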
The role of a domain expert at the stage of model assessment consisted in general in accepting or rejecting a given set of rules as the final model. These decisions were taken primarily on the basis of the estimated predictive abilities of the model, but also on the basis of other properties of the rules (including the process knowledge they represented) and practical knowledge about the casting process and its actual conditions. The next section presents the detailed results of the work on rule-based models of the process of ductile iron casting.

Final model
As a result of the tuning of the model obtained with the expert's subset of conditional attributes, i.e. five alloying elements: {Mn, Si, Cr, Ni, Cu}, the consistency measure of μ (standard class unions) was adopted and the level of consistency was set at 0.93. Two classification strategies were used: one with a forced assignment to the majority class, and one with an acceptable lack of class assignment. The remainder of this section presents the properties of the resulting model.

Properties of decision table
The properties of the decision table in strength assessment, such as classification quality or accuracy of class union approximation, slightly worsened after reducing its conditional attributes (criteria) to a set of five alloying elements (ESA), as compared to the original set of all attributes. The quality of classification is 0.86, and the accuracy of approximation for each class union is in the range of 0.520 to 0.956 (Table 5).
The total number of conditional attributes in the decision table for the classification of strength was 5. There was no need to duplicate attributes, because the direction of preference for all five elements is known. Because a consistency level below 1 (0.93) was used, neither reducts nor the core were determined from the decision table.

Rules
As a result of rule induction from the decision table 87 unambiguous decision rules were obtained, including 24 for the class union of "at least 3", 15 of the lower approximation of class union "at least 2", 15 of the lower approximation of class union "at most 1", and 33 from the lower approximation of the class union "at least 2". The part of rules set which is a model of casting strength is shown in Table 6.
The resulting set of rules, with the reclassification of the entire set of data (including objects from the lower and upper approximations of class unions), is characterized by a classification accuracy of 85.1 % (with an error of 14.9 % and 0 % of non-classification). Table 7 contains a matrix of errors obtained for the reclassification of the entire set of data using rules derived by applying the "new method of classification".
The analysis of the matrix of errors (Table 7) shows that a very small percentage of all errors corresponds to the incorrect assignment of objects originally belonging to class 1 (4 %, 5/129). In contrast, 21 % (44/211) of all cases originally belonging to class 2 and as many as 37 % (80/215) of cases from class 3 are misclassified. Cases from minority classes (2 and 3) are very often wrongly assigned by the set of rules. However, in the case of class 2 the error of assignment to class 1 (the majority class) dominates, while in the case of class 3 misclassification to class 2 dominates. For class 3 there is also the "by two classes" classification error (an object originally belonging to class 3 is assigned by the classifier to class 1).
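The percentages quoted above follow directly from the reported counts, as the short check below shows. All numbers are taken from the text except the size of class 1, which is derived here as 866 - 211 - 215; the variable names are illustrative.

```python
total_cases = 866
class_sizes = {2: 211, 3: 215}
class_sizes[1] = total_cases - sum(class_sizes.values())  # derived value: 440
errors = {1: 5, 2: 44, 3: 80}  # misclassified cases per original class

total_errors = sum(errors.values())
accuracy = 1 - total_errors / total_cases
share_of_errors_class1 = errors[1] / total_errors
per_class_error = {c: errors[c] / class_sizes[c] for c in errors}

print(f"errors: {total_errors}, accuracy: {accuracy:.1%}")
print(f"class 1 share of all errors: {share_of_errors_class1:.0%}")
for c in sorted(per_class_error):
    print(f"class {c} misclassification rate: {per_class_error[c]:.0%}")
```

Note that the 4 % for class 1 is a share of all 129 errors, whereas the 21 % and 37 % for classes 2 and 3 are shares of the cases in those classes.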

The set of rules obtained during the data analysis and mining (step 2 of the method) was assessed for the effectiveness of classification of new cases (step 3). Below are the averaged results of five 10-fold cross-validation tests.

Cross-validation
For each of the classification strategies (with forced assignment to classes and with acceptable non-classification) for the resulting set of 87 rules, the 10-fold (stratified) cross-validation test was performed five times in order to obtain a more reliable result (average score). Tables 8 and 9 below show the basic results of the computational experiments.
All the calculations were made in the aMOPS software.
The tables show average values of overall classification accuracy as well as additional measures of classifier effectiveness, such as sensitivity and precision, calculated for each decision class.
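For reference, sensitivity and precision for a single decision class can be computed from an error (confusion) matrix as in the sketch below. The function name and the matrix values are hypothetical and are not the values from Tables 8 and 9.

```python
def per_class_measures(confusion):
    """Return {class: (sensitivity, precision)}.

    `confusion[actual][predicted]` holds the number of cases of class
    `actual` that the classifier assigned to class `predicted`.
    """
    classes = list(confusion)
    result = {}
    for c in classes:
        true_positive = confusion[c].get(c, 0)
        actual_total = sum(confusion[c].values())
        predicted_total = sum(confusion[a].get(c, 0) for a in classes)
        sensitivity = true_positive / actual_total if actual_total else 0.0
        precision = true_positive / predicted_total if predicted_total else 0.0
        result[c] = (sensitivity, precision)
    return result

# Hypothetical error matrix for three classes.
confusion = {
    1: {1: 8, 2: 2, 3: 0},
    2: {1: 3, 2: 6, 3: 1},
    3: {1: 0, 2: 4, 3: 6},
}
measures = per_class_measures(confusion)
```

Sensitivity answers how many cases of a class were recognized, while precision answers how many cases assigned to a class really belong to it; a class that absorbs errors from its neighbours therefore shows low precision.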
Classification accuracy for the obtained model is approx. 78 % in case of forced assignment to majority class, and approx. 74 % in case of acceptable nonclassification.
For the other measures of classification effectiveness, the lowest values are: in the case of forced class assignment, sensitivity for class 3 (54 %) and precision for class 2 (58 %); in the case of acceptable non-classification, sensitivity for class 3 (42 %) and precision for class 2 (57 %). These results suggest that for both classification strategies class 3 is recognized most poorly (objects that belong to the class are mistakenly assigned to class 2).
The model obtained from the expert's subset of attributes (five alloying elements) shows better predictive abilities than the model obtained from all 9 alloying elements. This is demonstrated by higher general accuracy of classification (for both strategies, although in the case of acceptable non-classification the difference is very small) and by higher total values of the precision and sensitivity measures obtained for the models. In addition, the model obtained from a smaller number of attributes is preferred as the final solution by the domain expert.

Conclusions
The main purpose of the article was to show the ability of the DRSA method to assess the ductile iron casting process. The results of the research indicate that this method is suitable for industrial practice and can be useful in the decision-making process.
It can be concluded that the overall predictive ability of the final model for assessing the quality of the casting and casting process, expressed by the general accuracy of classification, is satisfactory for tensile strength (> 77 %).
Taking into account the consciously adopted restrictions of the considered number (and type) of casting process parameters (input variables of the model), the domain expert considered the result satisfactory.
It must be remembered and emphasized that casting processes are non-algorithm based, and actual relations that take place there are very difficult to grasp (examine, model). Moreover, the analyzed dataset took into account only the chemical composition of elements in the melt as the most important process parameters, disregarding a number of other parameters which have (or may have) an impact on the quality of the casting. In the light of the foregoing, the results obtained may be considered satisfactory.
It should also be noted that the resulting models are not universal models of ductile iron casting process, but apply only in the specific foundry, from which the learning data (data set) was taken. This shows the significance of Cu (copper); it was controlled in the casting shop (testing facility) to obtain the appropriate mechanical properties of the cast. The relation has been confirmed by the models, as it is a part of each subset used to build the final model.
In further research work possible incompleteness of data should be taken into account. One of the possible methods that can be considered is the use of the method proposed in [24,25].