Ticket #1348: 0002-Fix-missing-headings-when-retrieving-article.patch

File 0002-Fix-missing-headings-when-retrieving-article.patch, 1.0 KB (added by jpichon, 15 years ago)

Missing headings

  • infoslicer/processing/HTML_Parser.py

    From d112a8225a0c99841225f44e5dd60e178dc447d2 Mon Sep 17 00:00:00 2001
    From: Julie Pichon <julie.pichon@gmail.com>
    Date: Sun, 25 Oct 2009 14:11:40 +0000
    Subject: [PATCH 2/2] Fix missing headings when retrieving article
    
    ---
     infoslicer/processing/HTML_Parser.py |    4 ++--
     1 files changed, 2 insertions(+), 2 deletions(-)
    
    diff --git a/infoslicer/processing/HTML_Parser.py b/infoslicer/processing/HTML_Parser.py
    index b99e754..adb6eb0 100644
    --- a/infoslicer/processing/HTML_Parser.py
    +++ b/infoslicer/processing/HTML_Parser.py
    @@ -28,8 +28,8 @@ class HTML_Parser:
         #=======================================================================
         # These lists are used at the parsing stage
         root_node = "body"
    -    section_separators = ["h3", "h4", "h5"]
    -    reference_separators = ["h1", "h2"]
    +    section_separators = ["h2", "h3", "h4", "h5"]
    +    reference_separators = ["h1"]
         block_elements = ["img", "table", "ol", "ul"]
         #=======================================================================
     
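
As a rough illustration of the effect of this two-line change, the sketch below compares the old and new separator lists and reports which tags switch category. The `classify` and `changed_tags` helpers are hypothetical and are not part of infoslicer; the patch does not show how `HTML_Parser` actually consumes these lists, so this only demonstrates the list membership change.

```python
# Hypothetical illustration only: these helpers do not exist in infoslicer.
# The list contents are copied from the patch above.
OLD = {
    "section": ["h3", "h4", "h5"],
    "reference": ["h1", "h2"],
}
NEW = {
    "section": ["h2", "h3", "h4", "h5"],
    "reference": ["h1"],
}


def classify(tag, separators):
    """Return the name of the separator list containing `tag`, or None."""
    for kind, tags in separators.items():
        if tag in tags:
            return kind
    return None


def changed_tags(old, new):
    """Map each tag whose category differs to an (old, new) category pair."""
    all_tags = sorted(
        {tag for lists in (old, new) for tags in lists.values() for tag in tags}
    )
    return {
        tag: (classify(tag, old), classify(tag, new))
        for tag in all_tags
        if classify(tag, old) != classify(tag, new)
    }


if __name__ == "__main__":
    print(changed_tags(OLD, NEW))
```

Under these assumptions, only `h2` changes category: it moves from the reference separators to the section separators, which matches the patch subject about headings going missing.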