unofficial mirror of guix-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Web site news to move to Haunt
@ 2016-10-26  8:09 Ludovic Courtès
  2016-10-26 17:54 ` Christopher Allan Webber
  2016-10-27 23:33 ` Ludovic Courtès
  0 siblings, 2 replies; 6+ messages in thread
From: Ludovic Courtès @ 2016-10-26  8:09 UTC (permalink / raw)
  To: guix-devel

Hello Guix!

In the ‘wip-haunt’ branch of guix-artwork.git, I started switching the
web site to use Haunt, as was suggested long ago:

  https://lists.gnu.org/archive/html/guix-devel/2015-11/msg00087.html

The goal is to manage news using Haunt, and to have a page to display
them on the web site, something nicer and more flexible than this:

  https://savannah.gnu.org/news/?group=guix

And I already enjoy Haunt!  :-)

Hopefully I’ll be done real soon and the web site will become the
official news host (I’ll have to email “planet” maintainers to update
the feed URL.)

I have a couple of procedures to import the old news items¹ and to write
them as alists with SXML suitable for input to Haunt.

Ludo’.

¹ Actually only the 20 most recent items, due to this hard-coded limit
  in Savane:
  <http://git.savannah.gnu.org/cgit/administration/savane.git/tree/frontend/php/news/atom.php#n61>.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Web site news to move to Haunt
  2016-10-26  8:09 Web site news to move to Haunt Ludovic Courtès
@ 2016-10-26 17:54 ` Christopher Allan Webber
  2016-10-27 23:33 ` Ludovic Courtès
  1 sibling, 0 replies; 6+ messages in thread
From: Christopher Allan Webber @ 2016-10-26 17:54 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

Ludovic Courtès writes:

> Hello Guix!
>
> In the ‘wip-haunt’ branch of guix-artwork.git, I started switching the
> web site to use Haunt, as was suggested long ago:
>
>   https://lists.gnu.org/archive/html/guix-devel/2015-11/msg00087.html
>
> The goal is to manage news using Haunt, and to have a page to display
> them on the web site, something nicer and more flexible than this:
>
>   https://savannah.gnu.org/news/?group=guix
>
> And I already enjoy Haunt!  :-)

Horray!  Haunt is great, and it is nice to see more users... means we
can collaboratively help tune Haunt to our collective needs more easily.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Web site news to move to Haunt
  2016-10-26  8:09 Web site news to move to Haunt Ludovic Courtès
  2016-10-26 17:54 ` Christopher Allan Webber
@ 2016-10-27 23:33 ` Ludovic Courtès
  2016-10-28  7:28   ` ng0
                     ` (2 more replies)
  1 sibling, 3 replies; 6+ messages in thread
From: Ludovic Courtès @ 2016-10-27 23:33 UTC (permalink / raw)
  To: guix-devel

[-- Attachment #1: Type: text/plain, Size: 1654 bytes --]

Hello Guix!

ludo@gnu.org (Ludovic Courtès) skribis:

> The goal is to manage news using Haunt, and to have a page to display
> them on the web site, something nicer and more flexible than this:
>
>   https://savannah.gnu.org/news/?group=guix
>
> And I already enjoy Haunt!  :-)

I’m done!

Please visit the news pages and report anything wrong!  Also, please
test the feed.xml file; currently IceCat reports that it “does not
appear to have any style information associated with it”, and thus
doesn’t display it nicely.  Maybe Haunt omits to emit (ah ha!) the
doctype thingie?

Because Savannah’s feeds only include the 20 newest entries¹, Assaf
Gordon, who works on Savannah, kindly exported the raw Guix news
database for me (54 entries).

The database contains the input markup, not the XML that appears in the
feed.  Of course the markup is nothing like Markdown, CommonMark, etc.

So I considered writing a parser in Scheme, but that turned out to be
trickier than I thought, so I turned back to running Savane’s own parser
(PHP!).  Since I don’t have PHP on Guix, I ran it on another machine
where it’s installed: PHP as a service.  And since I couldn’t run all of
Savane, I just hacked together ‘markup.php’ (attached).  High-quality
work.

And then I reused the atom-parsing code that used to be in our (www)
module to get the damn entries and emit them as SXML for Haunt (file
attached).  Phewww.

I think we should do the same for Guile real soon.

Anyway, big thanks to Assaf!

Ludo’.

¹ https://lists.gnu.org/archive/html/savannah-hackers-public/2016-10/msg00017.html


[-- Attachment #2: markup.php --]
[-- Type: application/octet-stream, Size: 25262 bytes --]

<?php
# <one line to give a brief idea of what this does.>
# 
#  Copyright 2005-2006 (c) Tobias Toedter <t.toedter--gmx.net>
#                          Mathieu Roy <yeupou--gnu.org>
# 
# This file is part of Savane.
# 
# Savane is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as
# published by the Free Software Foundation, either version 3 of the
# License, or (at your option) any later version.
# 
# Savane is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Affero General Public License for more details.
# 
# You should have received a copy of the GNU Affero General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

## Provides functions to allow users to format the text in a secure way:
##    markup_basic() for very light formatting
##    markup_rich() for formatting excepting headers
##    markup_full() for full formatting, including headers

require_once('utils.php');


## Will tell the user what is the level of markup available in a uniformized
# way. 
# Takes as argument the level, being full / rich / basic / none
# To avoid making page looking strange, we will put that only on textarea
# where it is supposed to be the most useful
function markup_info($level, $additionnal_string=false)
{
  if ($level == 'basic')
    {
      $string = _("Basic Markup");
      $text = _("Only basic text tags are available in this input field.");
    }
  elseif ($level == 'rich')
    {
      $string = _("Rich Markup");
      $text = _("Rich and basic text tags are available in this input field.");      
    }
  elseif ($level == 'full') 
    {
      $string = _("Full Markup");
      $text = _("Every tags are available in this input field.");      
    }
  elseif ($level == 'none')
    {
      $string = _("No Markup");
      $text = _("No tags are available in this input field.");    
    }

  if ($level != 'none')
    {
      $text .= " "._("Check the Markup Reminder in Related Recipes for a description of these tags.");
    }

  return '<span class="smaller">('.utils_help('<img src="'.$GLOBALS['sys_home'].'images/'.SV_THEME.'.theme/misc/edit.png" border="0" class="icon" alt="'.$string.'" />'.$string, 
		    $text,
		    true).$additionnal_string.')</span>';
}


##
# Converts special markup characters in the input text to real HTML
#
# The following syntax is supported:
# * *word* -> <strong>word</strong>
# * _word_ -> <em>word</em>
# * [http://gna.org/] -> <a href="http://gna.org/">http://gna.org/</a>
# * [http://gna.org/ text] -> <a href="http://gna.org/">text</a>
# * (bug|task|...) #1234 -> Link to corresponding page
#
function markup_basic($text)
{
  $lines = explode("\n", $text);
  $result = array();

  foreach ($lines as $line)
    {
      $result[] = _markup_inline($line);
    }

  return join("\n", $result);
}



##
# Converts special markup characters in the input text to real HTML
#
# This function does the same markup as utils_basic_markup(), plus
# it supports the following:
# * paragraphs
# * lists (<ul> and <ol>)
# * nested lists
# * horizontal rulers
#
function markup_rich($text)
{
  return markup_full($text, false);
}



##
# Converts special markup characters in the input text to real HTML
#
# This function does the same markup as utils_rich_markup(), plus
# it converts headings to <h3> ... <h6>
#
function markup_full($text, $allow_headings=true)
{
  $lines = explode("\n", $text);
  $result = array();

  # we use a stack (last in, first out) to track the current
  # context (paragraph, lists) so we can correctly close tags
  $context_stack = array();

  $quoted_text = false;
  $verbatim = false;
  # $printer = array(1, 2);
  foreach ($lines as $index => $line)
    {

      # the verbatim tags are not allowed to be nested, because
      # they are translated to HTML <textarea> (<pre> in printer mode),
      # which in turn is also
      # not allowed to be nested.
      # therefore, we dont need a counter of the level, but
      # a simple bool flag
      # We also need to bufferize the verbatim content, as we want to now
      # its exact number of lines
      #
      # yeupou, 2006-10-31: we need a verbatim count, because actually 
      # we may want to put at least one verbatim block into another, for
      # instance in the recipe that explain the verbatim tag
      if (preg_match('/([+]verbatim[+])/', $line) and !$verbatim)
        {
          $verbatim = 1;  
	  $verbatim_buffer = '';
	  $verbatim_buffer_linecount = 0;

	  $line = join("\n", $context_stack);

	  if (empty($printer))
	    { array_unshift($context_stack, '</textarea>'); }
	  else
	    { array_unshift($context_stack, '</pre>'); }
	  
	  # Jump to the next line, assuming that we can ignore the rest of the
	  # line
	  continue;
        }

      # Increment the verbatim count if we find a verbatim closing in a 
      # verbatim environment
      if (preg_match('/([+]verbatim[+])/', $line) and $verbatim)
        { $verbatim++; }

      if (preg_match('/([-]verbatim[-])/', $line) and $verbatim == 1)
        {
          $verbatim = false;

          $line = join("\n", $context_stack);
          array_shift($context_stack);

          #array_pop($result); # no longer useful since we bufferize verbatim
	  if (empty($printer))
	    {
	      # Limit the textarea to 20 lines
	      if ($verbatim_buffer_linecount > 20)
		{ $verbatim_buffer_linecount = 20; }

	      # Use a text input if it is not multiline
	      if ($verbatim_buffer_linecount < 2)
		{
		  $result[] = '<input type="text" class="verbatim" readonly="readonly" size="60" value="'.$verbatim_buffer.'" />';
		}
	      else
		{		  
		  $result[] = '<textarea class="verbatim" readonly="readonly" rows="'.$verbatim_buffer_linecount.'" cols="80">'.$verbatim_buffer.'</textarea>';
		}
	    }
	  else
	    {
	      $result[] = '<pre class="verbatim">'.$verbatim_buffer.'</pre>';
	    }
	  $verbatim_buffer = '';
	  $verbatim_buffer_linecount = 0;
	  
	  # Jump to the next line, assuming that we can ignore the rest of the
	  # line
	  continue;
        }

      # Decrement the verbatim count if we find a verbatim closing in a 
      # verbatim environment
      if (preg_match('/([-]verbatim[-])/', $line) and $verbatim > 1)
	  { $verbatim--; }

      # if we're in the verbatim markup, don't apply the markup
      if ($verbatim)
        {
          # disable the +nomarkup+ tags by inserting a unique string.
          # this has to be done in the original string, because that
          # is the one which will be split upon the +nomarkup+ tags,
          # see below
          $escaped_line = str_replace('nomarkup',
            'no-1a4f67a7-4eae-4aa1-a2ef-eecd8af6a997-markup', $line);
          $lines[$index] = $escaped_line;
          $verbatim_buffer .= $escaped_line . "\r";
	  $verbatim_buffer_linecount++;
        }
      else
        {
	  # Otherwise, normal run, do the markup
          $result[] = _full_markup($line, $allow_headings, $context_stack, $quoted_text);	  
	}

    }

  # make sure that all previously used contexts get their
  # proper closing tag by merging in the last closing tags
  $markup_text = join("\n", array_merge($result, $context_stack));

  # its easiest to markup everything, without supporting the nomarkup
  # tag. afterwards, we replace every nomarkup tag pair with the content
  # between those tags in the original string
  $original = preg_split('/([+-]nomarkup[+-])/', join("\n", $lines), -1,
    PREG_SPLIT_DELIM_CAPTURE);
  $markup = preg_split('/([+-]nomarkup[+-])/', $markup_text, -1,
    PREG_SPLIT_DELIM_CAPTURE);
  # save the HTML tags from the last element in the markup array, see below
  $last_tags = $markup[count($markup)-1];
  $nomarkup_level = 0;

  foreach ($original as $index => $original_text)
    {
      # keep track of nomarkup tags
      if ($original_text == '+nomarkup+') $nomarkup_level++;
      if ($original_text == '-nomarkup-') $nomarkup_level--;

      # if the current match is the nomarkup tag, we don't want it to
      # show up in the markup text -> set it to an empty string
      if (preg_match('/([+-]nomarkup[+-])/', $original_text))
        {
          $markup[$index] = '';
          $original_text = '';
        }
      # while we're in a nomarkup environment, the already marked up text
      # needs to be replaced with the original content. Also, we need
      # to add <br />  tags for newlines.
      if ($nomarkup_level > 0)
        {
          $markup[$index] = nl2br($original_text);
        }
    }

  # normally, $nomarkup_level must be zero at this point. however, if
  # the user submits wrong markup and forgets to close the -nomarkup-
  # tag, we need to take care of that.
  # To do this, we need to look for closing tags which have been deleted.
  if ($nomarkup_level > 0)
    {
      $trailing_markup = array_reverse(split("\n", $last_tags));
      $restored_tags = '';
      foreach ($trailing_markup as $tag)
        {
          if (preg_match('/^\s*<\/[a-z]+>$/', $tag))
            {
              $restored_tags = "\n$tag$restored_tags";
            }
          else
            {
              $markup[] = $restored_tags;
              break;
            }
        }
    }

  # lastly, revert the escaping of +nomarkup+ tags done above
  # for verbatim environments
  return str_replace('no-1a4f67a7-4eae-4aa1-a2ef-eecd8af6a997-markup',
    'nomarkup', join('', $markup));
}


# Convert whatever content that can contain markup to a valid text output
# It wont touch what seems to be valid in text already, or what cannot
# be converted in a very satisfactory way.
# This function should be minimal, just to avoid weird things, not to do
# very fancy things.
function markup_textoutput ($text)
{
  $lines = explode("\n", $text);
  $result = array();

  $protocols = "https?|ftp|sftp|file|afs|nfs";
  $savane_tags = "verbatim|nomarkup";

  foreach ($lines as $line)
    {
      # Handle named hyperlink.
      $line = 
	preg_replace(
              # find the opening brace '['
		     '/\['
              # followed by the protocol, either http:// or https://
		     .'(('.$protocols.'):\/\/'
              # match any character except whitespace or the closing
              # brace ']' for the actual link
		     .'[^\s\]]+)'
              # followed by at least one whitespace
		     .'\s+'
              # followed by any character (non-greedy) and the
              # next closing brace ']'
		     .'(.+?)\]/', '$3 <$1>', $line);
      
      # Remove savane-specific tags
      $line = preg_replace('/\+('.$savane_tags.')\+/', '', $line);
      $line = preg_replace('/\-('.$savane_tags.')\-/', '', $line);
      $result[] = $line;
      
    }

  return join("\n", $result);
}


##
# Internal function for recognizing and formatting special markup
# characters in the input line to real HTML
#
# This function is a helper for utils_full_markup() and should
# not be used otherwise.
#
function _full_markup($line, $allow_headings, &$context_stack, &$quoted_text)
{
  #############################################################
  # context formatting
  #
  # the code below marks up recognized special characters,
  # by starting a new context (e.g. headings and lists)
  #############################################################

  # generally, we want to start a new paragraph. this will be set
  # to false, if a new paragraph is no longer appropriate, like
  # for headings or lists
  $start_paragraph = true;

  # Match the headings, e.g. === heading ===
  if ($allow_headings)
    {
      $line = _markup_headings($line, $context_stack, $start_paragraph);
    }

  # Match list items
  $line = _markup_lists($line, $context_stack, $start_paragraph);

  # replace four '-' sign with a horizontal ruler
  if (preg_match('/^----\s*$/', $line))
    {
      $line = join("\n", $context_stack).'<hr />';
      $context_stack = array();
      $start_paragraph = false;
    }

  #############################################################
  # inline formatting
  #
  # the code below marks up recognized special characters,
  # without starting a new context (e.g. <strong> and <em>)
  #############################################################

  $line = _markup_inline($line);

  #############################################################
  # paragraph formatting
  #
  # the code below is responsible for doing the Right Thing(tm)
  # by either starting a new paragraph and closing any previous
  # context or continuing an existing paragraph
  #############################################################

  # change the quoteing mode when the line start with '>'
  if (substr($line, 0, 4) == '&gt;')
    {
      # if the previous line was not quoted, start a new quote paragraph
      if (!$quoted_text)
        {
          $line = join("\n", $context_stack)."<p class=\"quote\">$line";
          # empty the stack
          $context_stack = array('</p>');
          $start_paragraph = false;
        }
      $quoted_text = true;
    }
  else
    {
      # if the previous line was quoted, end the quote paragraph
      if ($quoted_text and $start_paragraph and $line != '')
        {
          $line = join("\n", $context_stack)."\n<p>$line";
          # empty the stack
          $context_stack = array('</p>');
        }
      $quoted_text = false;
    }

  # don't start a new paragraph again, if we already did that
  if (isset($context_stack[0]) && $context_stack[0] == '</p>')
    {
      $start_paragraph = false;
    }

  # add proper closing tags when we encounter an empty line.
  # note that there might be no closing tags, in this case
  # the line will remain emtpy.
  if (preg_match('/^(|\s*)$/', $line))
    {
      $line = join("\n", $context_stack)."$line";
      # empty the stack
      $context_stack = array();
      $start_paragraph = false;
    }

  # Finally start a new paragraph if appropriate
  if ($start_paragraph)
    {
      # make sure that all previously used contexts get their
      # proper closing tag
      $line = join("\n", $context_stack)."<p>$line";
      # empty the stack
      $context_stack = array('</p>');
    }

  # append a linebreak while in paragraph mode
  if (isset($context_stack[0]) && $context_stack[0] == '</p>')
    {
      $line .= '<br />';
    }

  return $line;
}



##
# Internal function for recognizing and formatting headings
#
# This function is a helper for _full_markup() and should
# not be used otherwise.
#
function _markup_headings($line, &$context_stack, &$start_paragraph)
{
  if (preg_match(
    # find one to four '=' signs at the start of a line
    '/^(={1,4})'
    # followed by exactly one space
    .' '
    # followed by any character
    .'(.+)'
    # followed by exactly one space
    .' '
    # followed by one to four '=' signs at the end of a line (whitespace allowed)
    .'(={1,4})\s*$/', $line, $matches))
    {
      $header_level_start = max(min(strlen($matches[1]), 4), 1);
      $header_level_end = strlen($matches[3]);
      if ($header_level_start == $header_level_end)
        {
          # if the user types '= heading =' (one '=' sign), it will
          # actually be rendered as a level 3 heading <h3>
          $header_level_start += 2;
          $header_level_end += 2;

          $line = "<h$header_level_start>$matches[2]</h$header_level_end>";
          # make sure that all previously used contexts get their
          # proper closing tag
          $line = join("\n", $context_stack).$line;
          # empty the stack
          $context_stack = array();
          $start_paragraph = false;
        }
    }
  return $line;
}



##
# Internal function for recognizing and formatting lists
#
# This function is a helper for _full_markup() and should
# not be used otherwise.
#
function _markup_lists($line, &$context_stack, &$start_paragraph)
{
  if (preg_match('/^\s?([*0]+) (.+)$/', $line, $matches))
    {
      # determine the list level currently in use
      $current_list_level = 0;
      foreach ($context_stack as $context)
        {
          if ($context == '</ul>' or $context == '</ol>')
            {
              $current_list_level++;
            }
        }

      # determine whether the user list levels match the list
      # level we have in our context stack
      #
      # this will catch (potential) errors of the following form:
      # * list start
      # 0 maybe wrong list character
      # * list end
      $markup_position = 0;
      foreach (array_reverse($context_stack) as $context)
        {
          # we only care for the list types
          if ($context != '</ul>' and $context != '</ol>')
            {
              continue;
            }

          $markup_character = substr($matches[1], $markup_position, 1);

          if (($markup_character === '*' and $context != '</ul>')
            or ($markup_character === '0' and $context != '</ol>'))
            {
              # force a new and clean list start
              $current_list_level = 0;
              break;
            }
          else
            {
              $markup_position++;
            }
        }

      # if we are not in a list, close the previous context
      $line = '';
      if ($current_list_level == 0)
        {
          $line = join("\n", $context_stack);
          $context_stack = array();
        }

      # determine the list level the user wanted
      $wanted_list_level = strlen($matches[1]);

      # here we start a new list and make sure that the markup
      # is valid, even if the user did skip one or more list levels
      $list_level_counter = $current_list_level;
      while ($list_level_counter < $wanted_list_level)
        {
          switch (substr($matches[1], $list_level_counter, 1))
            {
              case '*':
                $tag = 'ul';
                break;
              case '0':
                $tag = 'ol';
                break;
            }
          $line .= "<$tag>\n<li>";
          array_unshift($context_stack, "</$tag>");
          array_unshift($context_stack, "</li>");
          $list_level_counter++;
        }

      # here we end a previous list and make sure that the markup
      # is valid, even if the user did skip one or more list levels
      $list_level_counter = $current_list_level;
      while ($list_level_counter > $wanted_list_level)
        {
          $line .= array_shift($context_stack)."\n"
            .array_shift($context_stack)."\n";
          $list_level_counter--;
        }

      # prepare the next item of the same list level
      if ($current_list_level >= $wanted_list_level)
        {
          $line .= "</li>\n<li>";
        }

      # finally, append the list item
      $line .= $matches[2];
      $start_paragraph = false;
    }
  return $line;
}



##
# Internal function for recognizing and formatting inline tags and links
#
# This function is a helper for _full_markup() and should
# not be used otherwise.
#
function _markup_inline($line)
{
  # Group_id may be necessary for recipe #nnn links
  global $group_id;

  $comingfrom = '';
  if ($group_id)
    {
      $comingfrom = "&amp;comingfrom=$group_id";
    }

  if (strlen($line) == 0)
    {
      return;
    }

  # Regexp of protocols supported in hyperlinks (should be protocols that
  # we can expect web browsers to support)
  $protocols = "https?|ftp|sftp|file|afs|nfs";

  # Prepare usual links: prefix "www." with "http://"
  # if it is preceded by [ or whitespace or at the beginning of line.
  # (don't want to prefix in cases like "//www.." or "ngwww...")
  $line = preg_replace('/(^|\s|\[)(www\.)/i', '$1http://$2', $line); 

  # replace the @ sign with an HTML entity, if it is used within
  # an url (e.g. for pointers to mailing lists). This way, the
  # @ sign doesn't get mangled in the e-mail markup code
  # below. See bug #2689 on http://gna.org/ for reference.
  $line = eregi_replace("([a-z]+://[^<>[:space:]]+)@", "\\1&#64;", $line);

  # Prepare the markup for normal links, e.g. http://test.org, by
  # surrounding them with braces []
  # (& = begin of html entities, it means a end of string unless
  # it is &amp; which itself is the entity for &)
  $line = preg_replace('/(^|\s|[^\[])(('.$protocols.'):\/\/(&amp;|[^\s&]+[a-z0-9\/^])+)/i',
    '$1[$2]', $line);

  # do a markup for mail links, e.g. info@support.org
  # (do not use utils_emails, this does extensive database
  # search on the string
  # and replace addresses in several fashion. Here we just want to make
  # a link). Make sure that 'cvs -d:pserver:anonymous@cvs.sv.gnu.org:/...'
  # is NOT replaced.
  $line = preg_replace("/(^|\s)([a-z0-9_+-.]+@([a-z0-9_+-]+\.)+[a-z]+)(\s|$)/i",
		       '\1' . utils_email_basic('\2') . '\4', $line);

  # Links between items
  # FIXME: it should be i18n, but in a clever way, meaning that everytime
  # a form is submitted with such string, the string get converted in
  # english so we always get the links found without having a regexp
  # including every possible language.
  $trackers = array (
      "bugs?" => "bugs/?",
      "support|sr" => "support/?",
      "tasks?" => "task/?",
      "recipes?|rcp" => "cookbook/?func=detailitem$comingfrom&amp;item_id=",
      "patch" => "patch/?",
      # In this case, we make the link pointing to support, it wont matter,
      # the download page is in every tracker and does not check if the tracker
      # is actually used
      "files?" => "support/download.php?file_id=",
  );
  foreach ($trackers as $regexp => $link)
    {
      # Allows only two white space between the string and the numeric id
      # to avoid having too time consuming regexp. People just have to pay
      # attention.
      $line = preg_replace("/(^|\s|\W)($regexp)\s{0,2}#([0-9]+)/i",
        '$1<em><a href="'.$GLOBALS['sys_home']
        .$link.'$3">$2&nbsp;#$3</a></em>', $line);
    }

  # add an internal link for comments
  $line = preg_replace('/(comments?)\s{0,2}#([0-9]+)/i',
    '<em><a href="#comment$2">$1&nbsp;#$2</a></em>', $line);

  # Add support for named hyperlinks, e.g.
  # [http://gna.org/ Text] -> <a href="http://gna.org/">Text</a>
  $line = preg_replace(
    # find the opening brace '['
    '/\['
    # followed by the protocol, either http:// or https://
    .'(('.$protocols.'):\/\/'
    # match any character except whitespace or the closing
    # brace ']' for the actual link
    .'[^\s\]]+)'
    # followed by at least one whitespace
    .'\s+'
    # followed by any character (non-greedy) and the
    # next closing brace ']'
    .'(.+?)\]/', '<a href="$1">$3</a>', $line);
 
  # Add support for unnamed hyperlinks, e.g.
  # [http://gna.org/] -> <a href="http://gna.org/">http://gna.org/</a> 
  # We make sure the lenght of the string is not too long, otherwise we cut
  # it.
  # (Supposedly, preg_replace_callback is faster than preg_replace //e but
  # it seems less reliable)
  $line = preg_replace(
    # find the opening brace '['
    '/\['
    # followed by the protocol, either http:// or https://
    # (FIXME: which protocol does it makes sense to support, which one
    # should we ignore?)
    .'(('.$protocols.'):\/\/'
    # match any character except whitespace (non-greedy) for
    # the actual link, followed by the closing brace ']'
    .'[^\s]+?)\]/e', "utils_cutlink('$1')", $line);

  # *word* -> <strong>word</strong>
  $line = preg_replace(
    # find an asterisk
    '/\*'
    # then one character (except a space or asterisk)
    .'([^* ]'
    # then (optionally) any character except asterisk
    .'[^*]*?)'
    # then an asterisk
    .'\*/', '<strong>$1</strong>', $line);

  # _word_ -> <em>word</em>
  $line = preg_replace(
    # allow for the pattern to start at the beginning of a line.
    # if it doesn't start there, the character before the slash
    # must be either whitespace or the closing brace '>', to
    # allow for nested html tags (e.g. <p>_markup_</p>).
    # Additionally, the opening brace may appear.
    # See bug #10571 on http://gna.org/ for reference.
    '/(^|\s+|>|\()'
    # match the underscore
    .'_'
    # match any character (non-greedy)
    .'(.+?)'
    # match the ending underscore and either end of line or
    # a non-word character
    .'_(\W|$)/', '$1<em>$2</em>$3', $line);

  return $line;
}

print '<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <id>'.$id.'</id>
  <link rel="self" href="'.$myself.'"/>
  <title>'.$title.'</title>
  <updated>'.$last_updated.'</updated>

';

$stdin = fopen("php://stdin","r"); 
while ($row = fgetcsv($stdin, 10000000, "\t"))
{
  $id = "http://$sys_default_domain{$sys_home}forum/forum.php?forum_id={$row[0]}";
  $title = $row[4];
  $updated = date('c', $row[2]);
  $author = $row[1];
  $content = markup_full(trim(str_replace(array("\\n","\\t"), array("\n","\t"), $row[5])));

  print "
  <entry>
    <id>$id</id>
    <link rel='alternate' href='$id'/>
    <title>$title</title>
    <updated>$updated</updated>
    <author>
      <name>$author</name>
    </author>
    <content type='xhtml' xml:base='$id'>
      <div xmlns='http://www.w3.org/1999/xhtml'>$content</div>
    </content>
  </entry>
";
}

// Feed footer
print "</feed>";

[-- Attachment #3: oldnews.scm --]
[-- Type: application/octet-stream, Size: 4620 bytes --]

(define-module (oldnews)
  #:use-module (ice-9 match)
  #:use-module (ice-9 pretty-print)
  #:use-module (srfi srfi-1)
  #:use-module (srfi srfi-9)
  #:use-module (srfi srfi-11)
  #:use-module (srfi srfi-19)
  #:use-module (srfi srfi-26)
  #:use-module (web client)
  #:use-module (sxml match)
  #:use-module (sxml simple))

(define %atom-url
  ;; The web site's news feed.
  "http://savannah.gnu.org/news/atom.php?group=guix")

(define (atom->sxml port)
  (xml->sxml port
             #:namespaces '((atom . "http://www.w3.org/2005/Atom")
                            (xhtml . "http://www.w3.org/1999/xhtml"))
             #:trim-whitespace? #t))

(define (fetch-news)
  "Return the SXML tree of the Atom news feed."
  (call-with-values
      (lambda ()
        (http-get %atom-url))
    (lambda (response contents)
      (call-with-input-string contents atom->sxml))))

(define-record-type <news-entry>
  (news-entry url title date author content)
  news-entry?
  (url      news-entry-url)                       ;string
  (title    news-entry-title)                     ;string
  (date     news-entry-date)                      ;SRFI-19 date
  (author   news-entry-author)                    ;sxml
  (content  news-entry-content))                  ;sxml

(define* (news-items #:optional (atom (fetch-news)))
  "Return the list of <news-entry> taken from the web site's RSS feed."
  (sxml-match atom
    ((*TOP* (*PI* ,pi ...)
            (atom:feed
             (atom:id ,feed-id)
             (atom:link)
             (atom:title ,feed-title)
             (atom:updated ,feed-updated)
             (atom:entry
              (atom:id ,id)
              (atom:link (@ (href ,link)))
              (atom:title ,title)
              (atom:updated ,updated)
              (atom:author (atom:name ,author))
              (atom:content ,content)
              ,rest ...)
             ...
             ))
     (pk 'feed feed-id feed-title feed-updated)
     (map news-entry
          link title
          (map (cut string->date <> "~Y-~m-~d") updated)
          author content))))

(define (strip-xhtml-namespace tree)
  (define (xhtml-prefixed-symbol? obj)
    (and (symbol? obj)
         (string-prefix? "xhtml:" (symbol->string obj))))

  (match tree
    (((or 'xhtml:input 'xhtml:textarea) ('@ attributes ...) body ...)
     ;; XXX: Savannah's news thing used 'textarea' and 'input' tags for
     ;; verbatim text; convert it.  Sometimes the content is in the 'value'
     ;; attribute rather than in the body.
     (match (assoc 'value attributes)
       (('value value)
        `(div (@ (class "example")) (pre ,value)))
       (_
        `(div (@ (class "example")) (pre ,@body)))))

    (((? xhtml-prefixed-symbol? tag) rest ...)
     (let ((tag (string->symbol (string-drop (symbol->string tag)
                                             (string-length "xhtml:")))))
       `(,tag ,@(map strip-xhtml-namespace rest))))
    (x
     x)))

(define (news-entry->post-file-name entry)
  (define whitespace?
    (cut char-set-contains? char-set:whitespace <>))

  (define (map-character chr result)
    (match chr
      ((? whitespace?)
       (cons #\- result))
      ((or #\& #\! #\: #\,)
       result)
      ((or #\* #\?)
       (cons #\- result))
      (_
       (cons chr result))))

  (string-downcase
   (list->string (string-fold-right map-character
                                    '()
                                    (news-entry-title entry)))))

(define (news-entry->post-sexp entry)
  (let ((date (news-entry-date entry)))
    (define sexp
      `((title . ,(news-entry-title entry))
        (author . ,(news-entry-author entry))
        (date . ,(list 'unquote
                       `(make-date ,(date-nanosecond date)
                                   ,(date-second date)
                                   ,(date-minute date)
                                   ,(date-hour date)
                                   ,(date-day date)
                                   ,(date-month date)
                                   ,(date-year date)
                                   ,(date-zone-offset date))))
        (content . ,(strip-xhtml-namespace (news-entry-content entry)))))

    `(begin
       (use-modules (srfi srfi-19))
       ,(list 'quasiquote sexp))))

(define (write-news-entry-as-post entry)
  (let ((file (news-entry->post-file-name entry)))
    (call-with-output-file (string-append file ".sxml")
      (lambda (port)
        (pretty-print (news-entry->post-sexp entry)
                      port)))))


;; Local Variables:
;; eval: (put 'sxml-match 'scheme-indent-function 1)
;; End:

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Web site news to move to Haunt
  2016-10-27 23:33 ` Ludovic Courtès
@ 2016-10-28  7:28   ` ng0
  2016-10-28 11:46   ` Andy Wingo
  2016-11-04  3:29   ` Eric Bavier
  2 siblings, 0 replies; 6+ messages in thread
From: ng0 @ 2016-10-28  7:28 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Hello Guix!
>
> ludo@gnu.org (Ludovic Courtès) skribis:
>
>> The goal is to manage news using Haunt, and to have a page to display
>> them on the web site, something nicer and more flexible than this:
>>
>>   https://savannah.gnu.org/news/?group=guix
>>
>> And I already enjoy Haunt!  :-)
>
> I’m done!

Great work, thanks!

> Please visit the news pages and report anything wrong!  Also, please
> test the feed.xml file; currently IceCat reports that it “does not
> appear to have any style information associated with it”, and thus
> doesn’t display it nicely.  Maybe Haunt omits to emit (ah ha!) the
> doctype thingie?

For me the feed.xml works (without librejs) with latest icecat.

> Because Savannah’s feeds only include the 20 newest entries¹, Assaf
> Gordon, who works on Savannah, kindly exported the raw Guix news
> database for me (54 entries).
>
> The database contains the input markup, not the XML that appears in the
> feed.  Of course the markup is nothing like Markdown, CommonMark, etc.
>
> So I considered writing a parser in Scheme, but that turned out to be
> trickier than I thought, so I turned back to running Savane’s own parser
> (PHP!).  Since I don’t have PHP on Guix, I ran it on another machine
> where it’s installed: PHP as a service.  And since I couldn’t run all of
> Savane, I just hacked together ‘markup.php’ (attached).  High-quality
> work.
>
> And then I reused the atom-parsing code that used to be in our (www)
> module to get the damn entries and emit them as SXML for Haunt (file
> attached).  Phewww.
>
> I think we should do the same for Guile real soon.
>
> Anyway, big thanks to Assaf!
>
> Ludo’.
>
> ¹ https://lists.gnu.org/archive/html/savannah-hackers-public/2016-10/msg00017.html
>
>
>

-- 
♥Ⓐ  ng0  | ng0.chaosnet.org

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Web site news to move to Haunt
  2016-10-27 23:33 ` Ludovic Courtès
  2016-10-28  7:28   ` ng0
@ 2016-10-28 11:46   ` Andy Wingo
  2016-11-04  3:29   ` Eric Bavier
  2 siblings, 0 replies; 6+ messages in thread
From: Andy Wingo @ 2016-10-28 11:46 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

On Fri 28 Oct 2016 01:33, ludo@gnu.org (Ludovic Courtès) writes:

>> The goal is to manage news using Haunt, and to have a page to display
>> them on the web site, something nicer and more flexible than this:
>>
>>   https://savannah.gnu.org/news/?group=guix
>>
>> And I already enjoy Haunt!  :-)
>
> I’m done!

Nice!

> I think we should do the same for Guile real soon.

Yeah!  First thing would be to get Assaf to export the news items I
guess.

Super stuff!

A

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Web site news to move to Haunt
  2016-10-27 23:33 ` Ludovic Courtès
  2016-10-28  7:28   ` ng0
  2016-10-28 11:46   ` Andy Wingo
@ 2016-11-04  3:29   ` Eric Bavier
  2 siblings, 0 replies; 6+ messages in thread
From: Eric Bavier @ 2016-11-04  3:29 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

On Fri, 28 Oct 2016 01:33:38 +0200
ludo@gnu.org (Ludovic Courtès) wrote:

> Hello Guix!
> 
> ludo@gnu.org (Ludovic Courtès) skribis:
> 
> > The goal is to manage news using Haunt, and to have a page to display
> > them on the web site, something nicer and more flexible than this:
> >
> >   https://savannah.gnu.org/news/?group=guix
> >
> > And I already enjoy Haunt!  :-)  
> 
> I’m done!

Looks great!

> 
> Please visit the news pages and report anything wrong!  Also, please
> test the feed.xml file; currently IceCat reports that it “does not
> appear to have any style information associated with it”, and thus
> doesn’t display it nicely.  Maybe Haunt omits to emit (ah ha!) the
> doctype thingie?

IceCat reported the missing style for me on the first visit, but then
on a second visit seemed to apply some of its own default styling?
IDK.  I was able to import it into my RSS reader.

> 
> Because Savannah’s feeds only include the 20 newest entries¹, Assaf
> Gordon, who works on Savannah, kindly exported the raw Guix news
> database for me (54 entries).
> 
> The database contains the input markup, not the XML that appears in the
> feed.  Of course the markup is nothing like Markdown, CommonMark, etc.
> 
> So I considered writing a parser in Scheme, but that turned out to be
> trickier than I thought, so I turned back to running Savane’s own parser
> (PHP!).  Since I don’t have PHP on Guix, I ran it on another machine
> where it’s installed: PHP as a service.  And since I couldn’t run all of
> Savane, I just hacked together ‘markup.php’ (attached).  High-quality
> work.
> 
> And then I reused the atom-parsing code that used to be in our (www)
> module to get the damn entries and emit them as SXML for Haunt (file
> attached).  Phewww.

Sounds like fun ;)

`~Eric

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-11-04  3:29 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-10-26  8:09 Web site news to move to Haunt Ludovic Courtès
2016-10-26 17:54 ` Christopher Allan Webber
2016-10-27 23:33 ` Ludovic Courtès
2016-10-28  7:28   ` ng0
2016-10-28 11:46   ` Andy Wingo
2016-11-04  3:29   ` Eric Bavier

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).