Split Loop


Refactoring contributed by Martin Fowler

You have a loop that is doing two things

Duplicate the loop


void printValues() {
        double averageAge = 0;
        double totalSalary = 0;
        for (int i = 0; i < people.length; i++) {
                        averageAge += people[i].age;
                        totalSalary += people[i].salary;
        }
        averageAge = averageAge / people.length;
        System.out.println(averageAge);
        System.out.println(totalSalary);
}




void printValues() {
        double totalSalary = 0;
        for (int i = 0; i < people.length; i++) {
                        totalSalary += people[i].salary;
        }

        double averageAge = 0;
        for (int i = 0; i < people.length; i++) {
                        averageAge += people[i].age;
        }
        averageAge = averageAge / people.length;

        System.out.println(averageAge);
        System.out.println(totalSalary);
}

Motivation


You often see loops that are doing two different things at once, because they can do that with one pass through a loop. Indeed most programmers would feel very uncomfortable with this refactoring as it forces you to execute the loop twice - which is double the work.

But like so many optimizations, doing two different things in one loop is less clear than doing them separately. It also causes problems for further refactoring as it introduces temps that get in the way of further refactorings. So while refactoring, don't be afraid to get rid of the loop. When you optimize, if the loop is slow that will show up and it would be right to slam the loops back together at that point. You may be surprised at how often the loop isn't a bottleneck, or how the later refactorings open up another, more powerful, optimization.

Mechanics

  • Copy the loop and remove the differing pieces from each loop

  • Compile and test.

  • Reorganize the lines to group the loop with related code from outside the loop

  • Compile and test.

  • Consider applying
    Extract Method
    or
    Replace Temp with Query
    on each loop


Example

    Start with this code.


            private Person [] people;

            void printValues() {
                    double averageAge = 0;
                    double totalSalary = 0;
                    for (int i = 0; i < people.length; i++) {
                                    averageAge += people[i].age;
                                    totalSalary += people[i].salary;
                    }
                    averageAge = averageAge / people.length;
                    System.out.println(averageAge);
                    System.out.println(totalSalary);
            }

    The loop is calculating both the total salary and the average age. This is two separate tasks, which just happen to use the same data structure. So I'll split the loops for now, knowing I can optimize later.

    The first move is to copy the loop and remove from each leg one part of the calculation.

            void printValues() {
                    double averageAge = 0;
                    double totalSalary = 0;
                    for (int i = 0; i < people.length; i++) {
                                    totalSalary += people[i].salary;
                    }
                    for (int i = 0; i < people.length; i++) {
                                    averageAge += people[i].age;
                    }
                    averageAge = averageAge / people.length;
                    System.out.println(averageAge);
                    System.out.println(totalSalary);
            }

    I can now compile and test this.

    Then I can reorganize the text to group things together.


            void printValues() {
                    double totalSalary = 0;
                    for (int i = 0; i < people.length; i++) {
                                    totalSalary += people[i].salary;
                    }

                    double averageAge = 0;
                    for (int i = 0; i < people.length; i++) {
                                    averageAge += people[i].age;
                    }
                    averageAge = averageAge / people.length;

                    System.out.println(averageAge);
                    System.out.println(totalSalary);
            }


    One thing that has bothered me about this refactoring is that there's a lot of duplicate loop setup code - and duplication is the worst smell of all. On the whole, however, I don't feel that concerned about it. Fundamentally the duplication is due to the godawful way that Java does loops. Other modern languages with a foreach statement avoid the duplication completely.


    Officialy that's the end of the refactoring. But the important thing here is what it leads you to do. In this case I'm inclined to apply Replace Temp with Query to each loop.


            void printValues() {
                    System.out.println(averageAge());
                    System.out.println(totalSalary());
            }

            private double averageAge() {
                    double result = 0;
                    for (int i = 0; i < people.length; i++) {
                                    result += people[i].age;
                    }
                    return result / people.length;
            }

            private double totalSalary() {
                    double result = 0;
                    for (int i = 0; i < people.length; i++) {
                                    result += people[i].salary;
                    }
                    return result;
            }



Additional Comments


The split loop is an often used performance
optimisation in data intensive applications. When you are accessing to
separate arrays in the same loop you can get hit badly by the cache
misses. That is, if the arrays are large enough you loose locality of
reference and the cache fails to do the job. Put enough code here and
every operation will not hit the cache and instead have to go back out
to main memory.


By splitting loops up into small separate components that only act on
one array you get significant performance increases. In rendering code
I've seen up to an order of magnitude difference. This is particularly
beneficial if it is primitive type based and less so if class reference
or pointer based where you need an additional dereference off into a
random memory location. This technique is also very similar to loop
unravelling for performance gains (although something I doubt would ever
appear in a refactoring/patterns/OO type book :)

--Justin Couch

Thanks to Bob Bowman for spotting and fixing an error.



***** 아름다운프로님에 의해서 게시물 복사 + 카테고리변경되었습니다 (2003-12-18 17:27)
Posted by 아름프로
BLOG main image

카테고리

분류 전체보기 (539)
이야기방 (19)
토론/정보/사설 (16)
IBM Rational (9)
U-IT (0)
SOA/WS/ebXML (110)
개발방법론/모델링 (122)
J2SE (34)
J2EE (60)
DataBase (39)
Open Projects (30)
BP/표준화 (50)
Apache Projects (15)
Web/보안/OS (22)
Tools (7)
AJAX/WEB2.0 (1)
Linux/Unix (1)
영어 (0)
비공개방 (0)

최근에 올라온 글

최근에 달린 댓글

최근에 받은 트랙백

달력

«   2025/02   »
1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28

글 보관함

Total :
Today : Yesterday :