GHOP #x: Improve coder_format to re-format Drupal 6 core without wrong formattings

  1. <dl><dt>Task description</dt>
  2. <dd>One of the things that has a great impact on Drupal's success is the widely adopted agreement on clean, sophisticated, and well-documented code. In an abstract view, we are not only using a language (PHP), but also a dialect to articulate what we are doing. This means that core developers and most contrib module maintainers try to ensure that everyone else in the community understands all parts of the application and is able to fix or improve the code without language barriers. Without this solid ground, no one would be able to participate and contribute in the way the Drupal community members do. Many patches against Drupal core are held back or even rejected because of unsuitable code.
  3. To support developers in using the proper syntax, formatting, and naming, and in adhering to all other development guidelines for Drupal, the <a href="/project/coder">Code Review</a> module has been developed. It outputs notices for all code that is considered wrong in terms of the <a href="/coding-standards">Drupal Coding Standards</a>. While Coder only reviews the code and leaves you with the results, the Coder module also ships the coder_format script, which strives to clean up and re-format a whole file so that it adheres to Drupal's Coding Standards.
  4. Cleaning the coding-style of a contributed module with coder_format is usually a 5-minute job. Coder_format can be run recursively on all files in a directory by invoking it with the <code>--batch-replace</code> argument. See inline documentation in <code>coder_format.php</code> for further information.
  5. However, because it is still in beta, coder_format sometimes applies wrong formatting to individual lines, or to every line that follows a piece of complex syntax in a file.<ul>
  6. <li>The goal of this GHOP task is to fix and improve the coder_format script, so that it can be applied recursively on Drupal 6 core without introducing wrong formattings (and fixing existing wrong formattings throughout Drupal core, of course).</li>
  7. <li>The result of this GHOP task must be a single patch file against <code>coder_format.inc</code> that is ready to be committed. To clarify in advance: this is not a patch against Drupal core. Patch files against Drupal core are only needed to communicate the current results and errors of coder_format in this issue.</li>
  8. <li>To work on this task, you should have a local development environment, be able to run a PHP script from the command line, and know how to access a CVS server and how to check out and create patches against a CVS module. You should also have a (preferably good) visual diff tool to compare a resulting file with the original.</li>
  9. <li>You will have to deal with rather complex regular expressions, Drupal Coding Standards, reviewing of Drupal core (with and without Coder module), creating and applying patches, writing helpful inline documentation, and identifying wrong formattings.</li>
  10. </ul>
  11. </dd>
  12. <dt>Resources</dt><dd><ul>
  13. <li>First of all, inline documentation in the <a href="http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/coder/scripts/coder_format/coder_format.inc?revision=1.2&amp;view=markup">coder_format script</a>.</li>
  14. <li><a href="/coding-standards">Drupal Coding Standards</a> (see also the more up-to-date document in <a href="http://cvs.drupal.org/viewvc.py/drupal/contributions/CODING_STANDARDS.html?content-type=text%2Fhtml&amp;view=co">CVS</a>).</li>
  15. <li><a href="/project/issues/coder">Coder's issue queue</a> which contains some very valuable information about proper coding-styles (use Advanced Search to find older issues).</li>
  16. <li>PHP's <a href="http://php.net/manual/ref.tokenizer.php">Tokenizer</a> and <a href="http://php.net/manual/tokens.php">Tokenizer Tokens</a> which are used to identify PHP's syntax.</li>
  17. <li><a href="http://www.regular-expressions.info/tutorial.html" title="http://www.regular-expressions.info/tutorial.html">http://www.regular-expressions.info/tutorial.html</a> is a great resource for the regex-related questions and how-tos you will probably need.</li>
  18. <li>If you are working on Windows, you might additionally use the <a href="http://drupal.org/node/149826">coder_format Windows installer</a>, which I'll update with an improved version when this task has been claimed.</li>
  19. </ul>
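To give a rough idea of what the Tokenizer returns, here is a minimal sketch. The helper name <code>example_list_tokens()</code> is made up for illustration and is not part of coder_format; only <code>token_get_all()</code> and <code>token_name()</code> are real PHP functions.

```php
<?php
/**
 * Hypothetical helper (not part of coder_format): returns the names of all
 * tokens found in a piece of PHP source code.
 */
function example_list_tokens($source) {
  $names = array();
  foreach (token_get_all($source) as $token) {
    if (is_array($token)) {
      // Array tokens look like array(id, text, line number);
      // token_name() turns the numeric id into a name such as T_IF.
      $names[] = token_name($token[0]);
    }
    else {
      // Single characters such as ';' or '{' are returned as plain strings.
      $names[] = $token;
    }
  }
  return $names;
}

$tokens = example_list_tokens('<?php if ($a) { return; }');
```

Single-character tokens carry no id, which is why a formatter has to handle both cases when it walks through a file.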
  20. </dd>
  21. <dt>Primary contact</dt>
  22. <dd><a href="http://drupal.org/user/54136">Daniel F. Kudwien (sun)</a>
  23. I'm pretty sure that fellow Drupallers like Doug, Stella, core developers, subscribers of the <a href="http://groups.drupal.org/coding-standards-and-performance-optimization">Coding Standards Group</a>, and also developers in the #drupal and #drupal-ghop IRC channels will help out with any questions related to missing information about Drupal Coding Standards.
  24. </dd>
  25. </dl>
  26. Please bear in mind that coder_format is already performing quite well. So the referenced resources about coding standards are only necessary to find out what something should look like <em>if there are any</em> wrong formattings. It may be hard to believe, but the attached patch is already based on a completely re-formatted Drupal core, which means that all files have been re-written by coder_format.
  27. Pre- and post-processors are already part of the script, so the mentioned post-processor for cleaning up lines containing only white-space is definitely not a big deal. Also, the mentioned bug in <code>drupal_urlencode()</code> is clearly caused by the pre-processor <code>coder_preprocessor_inline_comment()</code>.
  28.  
  29. For the sake of completeness I'm adding a patch file against Drupal 6 core that is the result of coder_format. It's about 1.5 MB, so I needed to zip it. You will discover the first formatting error in <code>drupal_urlencode()</code>, which leads to wrong indentation in all following lines. Furthermore, the patch would be dramatically smaller if there were a post-processor that removed any white-space from all lines containing only white-space (this is one of the needed improvements).
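To illustrate the idea of such a post-processor, here is a minimal sketch. The function name and the way it would be hooked into coder_format are assumptions on my part; the actual post-processor interface in <code>coder_format.inc</code> may look different.

```php
<?php
/**
 * Hypothetical post-processor (name and signature are assumptions, not
 * taken from coder_format.inc): empties lines that contain only white-space.
 */
function example_postprocessor_blank_lines($code) {
  // The 'm' modifier lets ^ and $ match at every line boundary.
  // [ \t]+ cannot cross a newline, so only the white-space on the line is
  // removed; the lookahead keeps an optional "\r" of Windows line endings.
  return preg_replace('/^[ \t]+(?=\r?$)/m', '', $code);
}

$before = "function foo() {\n  \$a = 1;\n  \n\t\n  return \$a;\n}\n";
$after = example_postprocessor_blank_lines($before);
```

Indentation in front of real code stays untouched, because the pattern only matches when nothing but spaces and tabs sits between the start and the end of a line.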