[14:04] oops: http://buscon.rae.es/draeI/SrvltObtenerHtml?TIPO_HTML=2&LEMA=mosca&SUPIND=0&CAREXT=1
[14:05] my current code is in ~/atocha/modules/drae.py
[14:11] I'd strip the HTML, split on \d\., and rejoin with my own numbers
[14:15] >>> import re
[14:15] >>> r_tag = re.compile(r'<[^>]+>')
[14:15] >>> r_digit = re.compile(r'\d+\.')
[14:15] >>> input = 'word 1. lalala 2. boing'
[14:15] >>> input = r_tag.sub('', input)
[14:15] >>> parts = r_digit.split(input)
[14:15] >>> parts
[14:15] ['word ', ' lalala ', ' boing']
[14:15] >>> for i, part in enumerate(parts[1:]):
[14:15] ...     print str(i + 1) + ')', part.strip()
[14:15] ...
[14:15] 1) lalala
[14:15] 2) boing
[14:15] >>>
[14:15] proof-of-concept
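For anyone wanting to reuse the proof-of-concept outside the REPL, here is a minimal sketch of the same three steps (strip tags, split on the numbering, renumber) wrapped in a function and written for Python 3. The function name `renumber` and the sample markup are assumptions made for illustration; they are not taken from drae.py or from the real DRAE page.

```python
import re

# Same two patterns as in the session above.
R_TAG = re.compile(r'<[^>]+>')    # any HTML tag
R_DIGIT = re.compile(r'\d+\.')    # sense numbering such as "1." or "12."


def renumber(html):
    """Strip tags, split on the original numbering, and renumber.

    Hypothetical helper: returns a list of strings like '1) lalala'.
    """
    text = R_TAG.sub('', html)
    parts = R_DIGIT.split(text)
    # parts[0] is the headword before the first number; skip it,
    # and drop any empty pieces left over after stripping whitespace.
    senses = [p.strip() for p in parts[1:] if p.strip()]
    return ['%d) %s' % (i, sense) for i, sense in enumerate(senses, 1)]


if __name__ == '__main__':
    # Made-up sample markup, not real DRAE output.
    for line in renumber('word <b>1.</b> lalala <b>2.</b> boing'):
        print(line)
```

One caveat with this approach: `\d+\.` also matches any other digits followed by a period inside a definition (a year at the end of a sentence, for example), so real pages may need a stricter pattern.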