Software Source Code: Theoretical Analyzing and Practical Reviewing Model

A high-quality software product is one of the most important goals of software engineering. Source coding, or the implementation phase, is one of the main steps of the software development process; when this phase proceeds well and follows source code quality standards and rules, the final product will be of high quality. This research focuses on the issues that can affect source code quality, such as errors, faults, defects, and failures. A practical system model is proposed to predict some possible types of errors and defects, and it suggests guidance and recommendations on source code error detection and analysis.

When software developers make mistakes in the early phases of the development lifecycle, the cost and effort of discovering and correcting the resulting errors in the implementation phase will be very high. Any mistake in the requirements phase can lead to an error in the design, and this error can in turn cause an error or defect in the source code implementation, as shown in Figure 1. Subsequently, it is very difficult to detect all errors in the testing phase, because it is not feasible to consider all possible test cases during testing. It is therefore possible for some serious errors to remain undetected in the software, turn into defects, and then lead to a general software system failure.

Background of the Research:
Understanding the error characteristics of a software product is as important to the development team as understanding the cost and effort of software development. The nature of software errors includes the rate at which errors occur, the effort and cost of discovering and removing errors, the risk posed by the errors, the common causes of errors, and the most effective methods for identifying and avoiding errors (Bassman, M. J., McGarry, F., & Pajerski 1994).
Source code, as the final product of the implementation phase, can contain different types of mistakes and errors, and the developers in this phase try to deliver a product that is as error-free as possible. They can eliminate syntax errors by compiling the source code with the programming language's compiler. This is a good step but not a sufficient one: many other types of errors, such as logical errors and errors specific to the programming language, can remain to be found in the testing phase.
In the testing phase, the testers do their best to discover all errors in order to achieve the testing goal, but there are cases where some errors hide other errors and lead to unpredictable outputs. In such cases the developers can never be sure whether the unexpected outputs are caused by new errors or by the original errors (Sommerville 2016).
The process of identifying, fixing, and removing errors from source code (debugging and inspection) is one of the main concerns of this research. To this end, a comprehensive analysis of the different types of errors, defects, faults, and failures is presented.

Aims and Objective:
The purpose of this research is to provide an analytical and practical approach for understanding the nature of source code errors and how to remove them in order to produce high-quality source code, through the following objectives:
• To understand the nature of source code errors, defects, and failures, and to identify the relationships among them.
• To analyze the effects of each error type.
• To identify, fix, and remove errors and defects from source code.
• To reduce the cost and effort of error detection.
• To provide recommendations and guidelines for evaluating source code.

Problem Statement
Producing a high-quality software product depends heavily on implementing high-quality source code. The distinctive attributes and characteristics of source code are strongly influenced by errors in the source code itself and by internal errors of the programming language used to develop the software product.

The Research Significance
This work is significant because software programmers and developers encounter difficulties in identifying and eliminating errors, defects, and failures in the development phase. These difficulties stem from the nature and effects of source code errors, defects, and failures, from the relationships among them, and from their negative impact on the overall quality of software products.

Literature Review
This section reviews important works and concepts related to source code errors and software analysis.

"Source Code Analysis -An Overview"(Kirkov and Agre 2010)
"Source Code Analysis - An Overview" (Kirkov and Agre 2010) describes the algorithms, major structure, techniques, and procedures of some tools used for software source code analysis, together with an experimental comparison of them. It illustrates the results of analyzing a sample program with some of these tools. The authors suggest using machine learning and data mining for source code analysis.
Vol.12 No.6 (2021), 1554-1562

Source Code Size Effects on Errors
As the size of the software (source code) increases, the type and quantity of errors are affected: a larger percentage of errors is caused by earlier development phases such as requirements and design, as shown in Figure 3 (McConnell 2004). Generally, the effort needed for code implementation and construction is about 65% for small software projects and about 50% for medium software projects. The percentage of coding and construction errors lies between 50% and 75%, depending on whether the project is small, medium, or large (McConnell 2004). The density and types of errors and defects change with size, and the quality factors of maintainability, readability, and testability are also negatively affected.

Logical Error Detection using Object-Oriented Environment:
To avoid logic errors, the researchers in this work provide "an Object Behavior Environment (OBEnvironment)" (Samara 2017) that allows users to discover logic errors. It enforces the proper use of objects according to their behaviors, using software tools such as "Xceed Component", and helps software developers produce better code and write correct statements and syntax in the C# programming language.

A framework and methodology for studying the causes of software errors in programming systems (Kirkov and Agre 2010)
This research provides a framework that focuses on identifying and describing the causes of errors in terms of constraints of cognitive breakdowns. The framework builds on both earlier and recent research on programming and on general studies of the mechanisms of human error (Kirkov and Agre 2010).

Program Slicing
An increase in the number of output variables in a program increases the probability of errors and decreases the performance of the program. Program slicing is one of the important techniques used for source code analysis: it helps to break down the source code, which makes the testing process easier and faster. Different slicing criteria are used to identify a block of source code statements so that it can be tested separately without disturbing the remaining parts of the program. Slicing improves understandability, readability, and testability because it divides the source code into separate slices, so that several testers can work on different slices at the same time (Sasirekha, Robert, and Hemalatha 2011).
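As a rough illustration of the idea, the following Java sketch computes a backward slice over straight-line code by following data dependences only. The class and method names (SliceSketch, Stmt, backwardSlice) are hypothetical and are not taken from the cited work; branches and loops, which a real slicer must handle through control dependences, are deliberately left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: backward slicing over straight-line code,
// following data dependences only (no branches or loops).
class SliceSketch {
    // One assignment statement: 'def' is the variable written, 'uses' the variables read.
    static final class Stmt {
        final int line;
        final String def;
        final Set<String> uses;

        Stmt(int line, String def, Set<String> uses) {
            this.line = line;
            this.def = def;
            this.uses = uses;
        }
    }

    // Returns the line numbers of the statements that can affect 'variable'
    // at the end of the program (the slicing criterion).
    static List<Integer> backwardSlice(List<Stmt> program, String variable) {
        Set<String> relevant = new HashSet<>();
        relevant.add(variable);
        List<Integer> slice = new ArrayList<>();
        for (int i = program.size() - 1; i >= 0; i--) {
            Stmt s = program.get(i);
            if (relevant.contains(s.def)) {
                relevant.remove(s.def);   // this definition supplies the value
                relevant.addAll(s.uses);  // its inputs become relevant instead
                slice.add(0, s.line);     // prepend to keep source order
            }
        }
        return slice;
    }

    public static void main(String[] args) {
        List<Stmt> program = List.of(
            new Stmt(1, "a", Set.of()),          // a = 5
            new Stmt(2, "b", Set.of()),          // b = 7
            new Stmt(3, "sum", Set.of("a")),     // sum = a + 1
            new Stmt(4, "prod", Set.of("b")),    // prod = b * 2
            new Stmt(5, "out", Set.of("sum")));  // out = sum
        System.out.println(backwardSlice(program, "out")); // prints [1, 3, 5]
    }
}
```

Lines 2 and 4 do not influence `out`, so a tester working on the slice for `out` can ignore them, while another tester handles the slice for `prod`.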

Methodology
This part discusses the research methodology for analyzing source code errors, faults, defects, and failures, including sampling and data collection.

Definitions
• Mistake: A user action that leads to invalid output.
• Software error: The difference between the software output and the true, expected, and correct value (IEEE 1990). It can be produced by a developer's mistake during the coding process.
• Defect: A silent error (a "killer" in some cases, especially in real-time applications) that causes a failure under certain conditions.
• Fault: A wrong procedure or an incorrect data type definition; it is also called a bug.
• Failure: An invalid result, or the interruption or termination of software execution; it can happen if the system has a defect or a fault.
• Logical error: An error caused by incorrect logical steps (algorithm) used to solve a problem.
• Run-time error: An error that occurs while the program is running.
• Syntax error: An incorrect use of the programming language syntax; it appears at compilation or interpretation time.

Debugging
Debugging is an important series of actions and steps taken to fix errors that have been detected in the testing process. Using the results of source code testing together with programming experience and knowledge, the people doing the debugging can locate the exact position of errors and determine their types, then repair them. They usually use interactive software tools that provide more information about the running source code (Sommerville 2016).

Syntax and Syntax Errors
The syntax of a programming language is the set of rules used to write well-formed statements for all parts of the source code, including variable and object declarations, expressions, commands, arrays, methods and procedures, and every single statement.
Any deviation from the syntax in source code expressions and structure causes a simple error known as a syntax error; this type of error is easily discovered at compilation time. Generally, a syntax error is a grammatical mistake, for example a missing bracket or semicolon, or using a lowercase character instead of an uppercase one in a case-sensitive language such as Java. Errors of this type are also known as compilation errors (Samara 2017).

Logical Errors:
Software source code written in any programming language is a formal script constructed according to strict rules. After compilation it should be free of all syntax errors; however, some errors remain even after the compilation process has completed successfully. These errors are called logical errors, and they come in many types: "Logical Errors are errors that remain after all syntax errors have been removed" (Samara 2017). Normally, the compilation process does not detect the presence of logical errors, which leads to unexpected results when the source code, or the software that uses it, is executed.
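A small Java sketch can make this concrete. Both methods below compile without complaint, yet the first one returns a wrong average because of integer division. The class and method names are hypothetical and chosen only for illustration.

```java
// Hypothetical illustration: both methods compile, but only one is logically correct.
class AverageSketch {
    // Logical error: (a + b) / 2 is evaluated with integer division,
    // so the fractional part is lost before the result is widened to double.
    static double averageBuggy(int a, int b) {
        return (a + b) / 2;
    }

    // Fix: dividing by 2.0 forces floating-point division.
    static double averageFixed(int a, int b) {
        return (a + b) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(averageBuggy(1, 2)); // prints 1.0
        System.out.println(averageFixed(1, 2)); // prints 1.5
    }
}
```

The compiler accepts both versions, so only testing or inspection reveals that the first one produces 1.0 where 1.5 is expected.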

Run Time Errors
This type of error cannot be discovered by compilation or interpretation. It occurs at run time, when the program tries to perform a function or evaluate an expression that cannot be executed, for example an infinite loop or a division by zero. Run-time errors usually cause system failure (Samara 2017).
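The following minimal Java sketch shows such a case: the division compiles, but it raises an ArithmeticException at run time when the divisor is zero. The names RuntimeErrorSketch, divide, and failsAtRuntime are hypothetical.

```java
// Hypothetical illustration of a run-time error that the compiler cannot detect.
class RuntimeErrorSketch {
    // Compiles without errors; fails at run time when denominator is 0.
    static int divide(int numerator, int denominator) {
        return numerator / denominator;
    }

    // Returns true when the division raised an ArithmeticException.
    static boolean failsAtRuntime(int numerator, int denominator) {
        try {
            divide(numerator, denominator);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsAtRuntime(10, 2)); // prints false
        System.out.println(failsAtRuntime(10, 0)); // prints true
    }
}
```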

Arithmetic error
Sometimes an arithmetic calculation produces inaccurate results, causing an error or failure. This type of error is called an arithmetic error. The possible cases of arithmetic errors must be specified in the requirements specification, and the actions and error handling for each error should also be identified (Sommerville 2016). The output of the example shown in Figure 3 is 0.999999999999999, which is not correct.
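The figure itself is not reproduced here, so the Java sketch below is only an assumed stand-in for it: adding 0.1 ten times in double precision does not give exactly 1.0. The class and method names are hypothetical, and the tolerance used in nearlyEqual is an arbitrary choice for illustration.

```java
// Hypothetical stand-in for the missing figure: rounding error in repeated addition.
class FloatSumSketch {
    // Adds 0.1 ten times; 0.1 has no exact binary representation,
    // so the rounding errors accumulate.
    static double sumOfTenths() {
        double sum = 0.0;
        for (int i = 0; i < 10; i++) {
            sum += 0.1;
        }
        return sum;
    }

    // Compares two doubles within a tolerance instead of using ==.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    public static void main(String[] args) {
        double sum = sumOfTenths();
        System.out.println(sum);                         // prints 0.9999999999999999
        System.out.println(sum == 1.0);                  // prints false
        System.out.println(nearlyEqual(sum, 1.0, 1e-9)); // prints true
    }
}
```

One possible way to handle this case is to compare floating-point results within a tolerance, as nearlyEqual does, or to use a decimal type when exact decimal values are required.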

Algorithmic error
This type of error occurs in the following cases:
• Connecting the branches of the algorithm in incorrect places.
• Testing the wrong condition.
• Not initializing variables and loops with correct initial values.
• Neglecting or forgetting to test a certain condition, such as division by zero.
• Comparing variables whose data types are inappropriate for their definitions.
This type of error includes grammatical errors of the programming language used, and these errors usually appear during the linking and translation operations.

Comments
To improve readability, testability, and maintainability, software developers must adhere to the rules for writing source code in accordance with international standards, especially when naming variables and identifiers and when choosing a data type appropriate to each variable's use. Comments (inline, inside blocks and loops, and at their beginning) should be used correctly and without an excessive number of comment lines, because too many may have the opposite effect according to the proposed mathematical relationship for calculating the maintenance factor (Al-qutaish and Al-qutaish 2010), which shows a sinusoidal relationship between the average number of comment lines and the size of the source code itself.
Equation no (1)

Figure 4: Relationship: MI and % of Comments for 1000LOC
We notice from Figure 4 that increasing the percentage of code documentation increases the value of the maintainability index (MI), but this increase stops at a certain limit (the 37% point for 1000 lines of code) because it follows sinusoidal behavior. Cyclomatic complexity and Halstead effort also decrease MI (see Equation no (1)). All of these issues can make error detection more difficult.

Error Handling and Exceptions
When an error occurs during program execution, the moment of its occurrence must be detected; otherwise, the error will turn into a fault or a failure, which can cause the entire software system to fail. Many programming languages support error handling and exceptions; for example, Java, C++, and C# use the try and catch structure. Using exceptions and error handlers provides an error indicator that can be used to take the correct decision at run time. If the error handler processes the error in the right way, the software can avoid the negative impact on its behavior. For example, a division by zero can lead to a very serious issue, especially in real-time software applications. Figure 5 shows an example of using exception handling to test an instance variable in a class constructor.
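The figure is not available in this text, so the Java sketch below is an assumed example in the same spirit: a constructor validates an instance variable and throws an exception, and the caller uses try and catch to decide what to do. The names AccountSketch, ExceptionSketch, and tryCreate, as well as the rule that a balance must not be negative, are hypothetical.

```java
// Hypothetical example: a constructor that validates an instance variable.
class AccountSketch {
    private final double balance;

    AccountSketch(double balance) {
        if (balance < 0) {
            // Signal the error at the moment it occurs instead of storing bad state.
            throw new IllegalArgumentException("balance must not be negative: " + balance);
        }
        this.balance = balance;
    }

    double getBalance() {
        return balance;
    }
}

class ExceptionSketch {
    // Uses try and catch to turn the exception into a decision at run time.
    static String tryCreate(double initialBalance) {
        try {
            AccountSketch account = new AccountSketch(initialBalance);
            return "created with balance " + account.getBalance();
        } catch (IllegalArgumentException e) {
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCreate(100.0)); // prints created with balance 100.0
        System.out.println(tryCreate(-5.0));  // prints rejected: balance must not be negative: -5.0
    }
}
```

Because the invalid value is rejected in the constructor, no object with an inconsistent state is ever created, and the caller receives a clear indicator of what went wrong.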

Proposed Source Code Analyzing Model:
The structure of most systems (automated or semi-automated) used for analyzing software source code can be classified into 4 compound blocks: "model construction, analysis, and pattern recognition algorithms, patterns knowledge, and result representation" (Kirkov and Agre 2010) (Al-qutaish and Al-qutaish 2010) (Binkley and Binkley 2007); other systems exist as well, such as "Optimization of Software Testing for Discrete Test suite using Genetic Algorithm and Sampling Technique" (PrasadaTripathy and Kanhar 2013). The proposed model uses a different method, as shown in Figure 6, and a general algorithm based on analyzing the operators, the operands, the relationships between them, and the data type declaration section, as shown below.

The General Algorithm model:
1) Read the source code file.
2) Extract the declaration section and the expressions (operators and operands): for each variable in the declaration section, prepare a list of operands and the data type used for each one.
3) Error detection:
a. For each assignment (=) operator, check the data types of its left and right operands. If the right side has:
• a plus (+) or multiplication (*) operator: if all operands have the same data type, an overflow error can happen, so the expression has a defect. The recommendation in this case is to change the data type of the left side to one that includes the data type of the right side (for example, the integer data type is included in the long integer type, and float in double), or to add restrictions on the input data so that it stays within the needed range.
• a division (/) operator: in this case the division by zero problem must be handled.
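The error detection step can be sketched in Java roughly as follows. This is not the paper's implementation: the class ExpressionCheckSketch, the method check, the map of declared types, and the wording of the findings are all assumptions made for illustration. The sketch also assumes simple assignments of the form `target = operand op operand` and interprets "the same data type" as the left side and every right-side operand sharing one declared type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of step 3: inspect one simple assignment statement.
class ExpressionCheckSketch {
    // 'types' maps each declared variable name to its declared data type (step 2).
    static List<String> check(Map<String, String> types, String statement) {
        List<String> findings = new ArrayList<>();
        String[] sides = statement.split("=", 2);
        if (sides.length < 2) {
            return findings; // not an assignment, nothing to check
        }
        String left = sides[0].trim();
        String right = sides[1].replace(";", "").trim();

        if (right.contains("/")) {
            findings.add("division: handle a possible zero divisor");
        }
        if (right.contains("+") || right.contains("*")) {
            // Assumption: operands are plain identifiers separated by operators.
            boolean sameType = true;
            for (String operand : right.split("[+*/-]")) {
                String type = types.get(operand.trim());
                if (type == null || !type.equals(types.get(left))) {
                    sameType = false;
                }
            }
            if (sameType) {
                findings.add("overflow: widen the type of '" + left
                        + "' or restrict the input range");
            }
        }
        return findings;
    }

    public static void main(String[] args) {
        Map<String, String> types = Map.of("Result", "int", "No1", "int", "No2", "int");
        System.out.println(check(types, "Result = No1 + No2;"));
    }
}
```

With all three variables declared as int, the sketch reports a possible overflow; declaring Result as long instead yields no finding, which matches the recommendation to widen the left side.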

Automated testing
By comparing the data type of the output operand (Result) on the left side of the expression (Result = No1 + No2) with the right side, we find that the right side consists of two operands and the plus (+) operator, so the result must stay within the integer range. In this case a list of recommendations is generated to direct the user of the program to avoid the expected errors, including the overflow error:
• Add a message that directs the user to keep data in the integer range.
• Enter data at run time using the Scanner class.
• Avoid entering values for both variables (No1 and No2) whose absolute value is greater than half of the maximum integer.
• Finally, try to use a function with two parameters of the double data type to test the accepted inputs.
Non-decimal calculations (octal as an example)
• Avoid a leading zero (0) in entered numbers, by adding an expression that checks for it.
• If you are using your program for octal calculations, try to enter numbers greater than 07.
Accuracy
• When using loops with non-integer (double or float) counters, take care with the final result, because it may not be correct.
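The three kinds of recommendation above can be illustrated with a short Java sketch. All names in it are hypothetical, and the specific values are chosen only to show the effects: integer overflow on addition, a leading zero changing the meaning of a number, and a loop with a double counter running one time more than expected.

```java
// Hypothetical illustration of the pitfalls behind the recommendations.
class InputPitfallsSketch {
    // Overflow: returns true when a + b does not fit in the int range.
    static boolean additionOverflows(int a, int b) {
        try {
            Math.addExact(a, b); // throws ArithmeticException on overflow
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // Octal: in Java source code, an integer literal with a leading 0 is octal.
    static int octalLiteral() {
        return 010; // eight, not ten
    }

    // Accuracy: counts the iterations of a loop driven by a double counter.
    static int iterationsWithDoubleCounter() {
        int count = 0;
        for (double d = 0.0; d < 1.0; d += 0.1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int half = Integer.MAX_VALUE / 2;
        System.out.println(additionOverflows(half, half));           // prints false
        System.out.println(additionOverflows(Integer.MAX_VALUE, 1)); // prints true
        System.out.println(octalLiteral());                          // prints 8
        System.out.println(Integer.parseInt("010"));                 // prints 10
        System.out.println(iterationsWithDoubleCounter());           // prints 11
    }
}
```

Keeping both inputs at or below half of the maximum integer guarantees that their sum fits in an int, and the loop runs 11 times rather than 10 because the accumulated counter is still slightly below 1.0 after ten additions.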

Lines of code
It is good in your program, but it must be less than the total number of effective lines of code, according to the quality standards.
Summary
It is better to use object-oriented programming to address the notes mentioned above.