IRC logs of #tryton for Wednesday, 2011-07-13

chat.freenode.net #tryton log beginning Wed Jul 13 00:00:01 CEST 2011
2011-07-13 00:12 -!- elbenfreund(~elbenfreu@g225234025.adsl.alicedsl.de) has joined #tryton
2011-07-13 01:11 -!- alimon(~alimon@201.158.247.118) has joined #tryton
2011-07-13 01:28 -!- elbenfreund(~elbenfreu@g225234025.adsl.alicedsl.de) has joined #tryton
2011-07-13 02:29 -!- alimon(~alimon@201.158.247.118) has joined #tryton
2011-07-13 02:58 -!- dfamorato(~dfamorato@2001:470:5:630:cabc:c8ff:fe9b:26b7) has joined #tryton
2011-07-13 05:00 -!- yangoon1(~mathiasb@p549F39C4.dip.t-dialin.net) has joined #tryton
2011-07-13 05:19 -!- gremly(~gremly@200.106.202.91) has joined #tryton
2011-07-13 05:51 -!- helmor(~helmo@46.115.22.167) has joined #tryton
2011-07-13 06:00 -!- sharoon(~sharoon@c-76-109-201-37.hsd1.fl.comcast.net) has joined #tryton
2011-07-13 06:07 -!- alimon(~alimon@189.154.110.53) has joined #tryton
2011-07-13 08:04 -!- enlightx(~enlightx@static-217-133-61-144.clienti.tiscali.it) has joined #tryton
2011-07-13 08:06 -!- bechamel(~user@host-85-201-144-79.brutele.be) has joined #tryton
2011-07-13 08:25 -!- enlightx(~enlightx@static-217-133-61-144.clienti.tiscali.it) has joined #tryton
2011-07-13 08:51 -!- pjstevns(~pjstevns@helpoort.xs4all.nl) has joined #tryton
2011-07-13 08:54 -!- helmor(~helmo@2.212.106.237) has joined #tryton
2011-07-13 09:20 -!- ralf58_(~quassel@dslb-088-071-224-216.pools.arcor-ip.net) has joined #tryton
2011-07-13 09:22 -!- bechamel(~user@cismwks02-virtual1.cism.ucl.ac.be) has joined #tryton
2011-07-13 09:44 -!- nicoe(~nicoe@ced.homedns.org) has joined #tryton
2011-07-13 09:48 -!- elbenfreund(~elbenfreu@f055003074.adsl.alicedsl.de) has joined #tryton
2011-07-13 09:48 -!- cedk(~ced@gentoo/developer/cedk) has joined #tryton
2011-07-13 10:09 -!- enlightx(~enlightx@static-217-133-61-144.clienti.tiscali.it) has joined #tryton
2011-07-13 10:18 -!- pjstevns(~pjstevns@a83-163-46-103.adsl.xs4all.nl) has joined #tryton
2011-07-13 11:23 -!- reichlich(~reichlich@p5793D42F.dip.t-dialin.net) has joined #tryton
2011-07-13 12:06 -!- cheche(cheche@46.25.80.67) has joined #tryton
2011-07-13 12:12 -!- ccomb(~ccomb@94.122.84.134) has joined #tryton
2011-07-13 12:18 -!- mhi(~mhi@p54894D47.dip.t-dialin.net) has joined #tryton
2011-07-13 12:21 -!- ccomb(~ccomb@94.122.84.134) has joined #tryton
2011-07-13 12:53 -!- sharoon(~sharoon@204-232-205-248.static.cloud-ips.com) has joined #tryton
2011-07-13 13:07 -!- ccomb(~ccomb@94.122.106.169) has joined #tryton
2011-07-13 13:11 -!- reichlich(~reichlich@p5793DBDB.dip.t-dialin.net) has joined #tryton
2011-07-13 13:52 -!- uranus(~uranus@96.57.28.107) has joined #tryton
2011-07-13 14:21 -!- vladimirek(~vladimire@adsl-dyn24.78-98-14.t-com.sk) has joined #tryton
2011-07-13 14:28 -!- elbenfreund(~elbenfreu@p54B92C1A.dip.t-dialin.net) has joined #tryton
2011-07-13 14:29 -!- sharoon(~sharoon@2001:470:5:630:e2f8:47ff:fe22:f228) has joined #tryton
2011-07-13 14:54 -!- dfamorato(~dfamorato@2001:470:5:630:cabc:c8ff:fe9b:26b7) has joined #tryton
2011-07-13 14:55 <dfamorato> bechamel: ping
2011-07-13 14:55 <bechamel> dfamorato: hi
2011-07-13 14:55 <dfamorato> bechamel: hi
2011-07-13 14:56 <dfamorato> bechamel: Did you get my e-mail yesterday ?
2011-07-13 14:56 <bechamel> dfamorato: yes and I check the code
2011-07-13 14:56 <bechamel> dfamorato: but didn't took time to test it
2011-07-13 14:56 <bechamel> dfamorato: is it fast ?
2011-07-13 14:56 <dfamorato> bechamel: Great, so what are your toughts
2011-07-13 14:56 <dfamorato> bechamel: It is almost the same time it takes to load the tryton pool
2011-07-13 14:57 <bechamel> dfamorato: my thought is that I did check the sphinx doc and discovered that the python binding are available in the source archive
2011-07-13 14:57 <dfamorato> bechamel: Yes, they are.... for querying the searchd server
2011-07-13 14:57 <bechamel> dfamorato: so I'm curious about the perf penalty of using the python binding vs let sphinx read the db
2011-07-13 14:57 <cedk> dfamorato: I was a little bit suprise to see that you use SQL queries to read Tryton's Model values
2011-07-13 14:58 <dfamorato> bechamel: Which is what we are going to use to implement with your unified search filed
2011-07-13 14:59 <bechamel> cedk: actauly it has the advantage of not interfering with the trytond instance that serve the clients
2011-07-13 14:59 <cedk> bechamel: what about function field, translation etc.
2011-07-13 15:00 <sharoon> cedk: dfamorato: bechamel: probably time to merge two projects - use pysql to generate the sql ??
2011-07-13 15:00 <bechamel> cedk: yes, it's the drawback :)
2011-07-13 15:01 <dfamorato> cedk: We can implement a function on the Postgres to read template_name from ir_translations
2011-07-13 15:01 <dfamorato> bechamel: This sphinx python api -> http://code.google.com/p/sphinxsearch/source/browse/branches/rel201/api/sphinxapi.py
2011-07-13 15:01 <dfamorato> bechamel: It's for querying the searchd instance
2011-07-13 15:01 <dfamorato> bechamel: It's not a native driver or something like that
2011-07-13 15:02 <bechamel> cedk, dfamorato: I think about an intermediate solution: a script that import the trytond pool and use it to feed sphinx, like that the master trytond instance is not directly depedant of spinx
2011-07-13 15:02 <dfamorato> cedk: bechamel , Yes, the current implementation does not require any additional change from the user
2011-07-13 15:03 <dfamorato> cedk: bechamel , Furthermore... future developments, the module developer does not need to worry about full text-indexing or not
2011-07-13 15:04 <dfamorato> cedk: bechamel , So, Tryton users will have the option to implement or not the Sphinx Search if they want....
2011-07-13 15:04 <dfamorato> cedk bechamel, If they want to implement... Just like a standard Tryton module
2011-07-13 15:05 <bechamel> dfamorato: yes, but to be able to index function field, I think the solution is to use xmlpipe2 to push data to sphinx
2011-07-13 15:06 <bechamel> I'm also thinking about indexing attachement but this bring other issues, we will se later if there is enough time to do it
2011-07-13 15:06 <cedk> dfamorato: but how ModelStorage.search will know to use Sphinx or not?
2011-07-13 15:06 <sharoon> bechamel: full text search AFAIK works only on char and text fields... how many of our function fields are actually char / text ?
2011-07-13 15:07 <bechamel> sharoon: those are the issue :), to index attachement we need odt2txt, pdf2txt, etc
2011-07-13 15:07 <dfamorato> bechamel, cedk ; Sharoon is right, the other fields that are not "string" char text are considered attributes
2011-07-13 15:07 <bechamel> sharoon: oh sorry I though you where talking about attachement indexing
2011-07-13 15:07 <dfamorato> cedk bechamel , So, in theory, they could only be used for sorting (and relevance ranking)
2011-07-13 15:08 <bechamel> dfamorato, sharoon: party.full_name and stuffs like that
2011-07-13 15:08 <cedk> and the translation ?
2011-07-13 15:08 <dfamorato> cedk bechamel I get it
2011-07-13 15:08 <cedk> sorry but you have to use the ORM of Tryton to get right values to index
2011-07-13 15:09 <sharoon> dfamorato: so xml pipe is your answer!
2011-07-13 15:09 <dfamorato> cedk: Translation can be retrieved from SQL as well.
2011-07-13 15:09 <bechamel> cedk: what do you think about leaving indexing outside trytond itself ?
2011-07-13 15:10 <sharoon> dfamorato: i think you will also need inherited indexes (indices) for each language that is translatable in tryton
2011-07-13 15:11 <dfamorato> sharoon: You are right. I already inherit data-sources
2011-07-13 15:11 <cedk> bechamel: don't understand
2011-07-13 15:11 <dfamorato> sharoon: I can inherit indexes as well, basically we define which morphology(stem) to use on the inherited index
2011-07-13 15:12 <dfamorato> sharoon: not hard to implement (at least on current implementation)
2011-07-13 15:12 <dfamorato> sharoon: might be harder if I use the xmlpipe
2011-07-13 15:12 <bechamel> cedk: my initial idea was to implement a hook in create and read, in order to push data to sphinx
2011-07-13 15:12 <cedk> bechamel: yes agree with that
2011-07-13 15:12 <bechamel> cedk: but this means that if sphinx is down or slow, trytond will get slow/unresponsive
2011-07-13 15:12 <dfamorato> bechamel: That is exactly my concern.... pushing data to sphinx
2011-07-13 15:13 <sharoon> dfamorato: is there a concept of push to sphinx ? i have only seen sphinx pull data
2011-07-13 15:13 <cedk> bechamel: but it was decided to use a mq
2011-07-13 15:13 <cedk> bechamel: mqueue
2011-07-13 15:13 <bechamel> cedk: not really decided, it was an option
2011-07-13 15:13 <bechamel> cedk: mq means another dependancy
2011-07-13 15:13 <cedk> bechamel: or just a table in Tryton, I don't care
2011-07-13 15:14 <bechamel> cedk: what appends is sphinx is down, and the trytond server is under heavy use ?
2011-07-13 15:14 <bechamel> cedk: ok this is maybe better
2011-07-13 15:15 <cedk> bechamel: and what happens if the database server is down but trytond is running ?
2011-07-13 15:16 <bechamel> cedk: yes of course, but adding a new spof is not a good idea (especialy it's possible to have trytond running correctly while sphinx is down)
2011-07-13 15:17 <cedk> bechamel: just catch connection error to sphinx and fallback to default behavior
2011-07-13 15:17 <cedk> bechamel: and put a log message
2011-07-13 15:17 <bechamel> cedk: and how to know what we need to index once sphinx is back ?
2011-07-13 15:18 <cedk> dfamorato: so now, you have only a pull method to fill sphinx ?
2011-07-13 15:18 <cedk> bechamel: because the table sphinx_job will be filled
2011-07-13 15:18 <dfamorato> cedk: No so sure, I am checking... xmlpipe2 seems to be pull
2011-07-13 15:18 <cedk> dfamorato: what is the current design?
2011-07-13 15:18 <sharoon> dfamorato: pull / push ?
2011-07-13 15:19 <cedk> dfamorato: have you a small schema?
2011-07-13 15:19 <dfamorato> cedk: This new xmlpipe ( http://sphinxsearch.com/docs/2.0.1/xmlpipe2.html ) appears to enable push streams
2011-07-13 15:19 <bechamel> cedk: My idea of a "sphinx table" is to keep the id of the time of the last indexed record, and a cron would be responsible to compare this time with current recored and index the new/updated records
2011-07-13 15:19 <dfamorato> cedk: Sorry, no schema... current module design is pull from tryton pool
2011-07-13 15:20 <bechamel> cedk: no need to copy in a table records that are already in the db
2011-07-13 15:20 <bechamel> cedk: the only exception I see is when we delete records
2011-07-13 15:20 <bechamel> cedk: we have to store all the ids (until the next push)
2011-07-13 15:21 -!- saxa(~sasa@189.26.255.43) has joined #tryton
2011-07-13 15:22 <dfamorato> bechamel: using xmlpipe2 we can put deleted records in an "kill list"
2011-07-13 15:22 <bechamel> dfamorato: yes, but I'm talking about the "sphinx is down" scenario
2011-07-13 15:23 <bechamel> I mean, instead of trying to push data synchronuously and put stuff in table/queue when there is a problem, I propose to just use a cron that push all updated data from his last run and at the end just write in the db the datetime if his last run
2011-07-13 15:24 <bechamel> *of his last run
2011-07-13 15:24 -!- pheller(~pheller@c1fw226.constantcontact.com) has joined #tryton
2011-07-13 15:24 <dfamorato> bechamel: An what is the time that you intend to run this cron
2011-07-13 15:24 <dfamorato> bechamel: every minute ?
2011-07-13 15:25 <bechamel> dfamorato: yes or every 10 second
2011-07-13 15:25 <sharoon> bechamel: i feel this is too close for a cron task.... might be better to make it async then and execute on an as and when data appears ?
2011-07-13 15:25 <bechamel> and IMO it should run in a separate process
2011-07-13 15:25 <sharoon> bechamel: probably use triggers
2011-07-13 15:26 <bechamel> sharoon: triggers are working synchronuously, no?
2011-07-13 15:26 <sharoon> bechamel: could be easily made async
2011-07-13 15:27 <bechamel> sharoon: it will cost an extra thread for each db access
2011-07-13 15:28 <cedk> I'm not sure about the design of storing what need to be updated because you can not know
2011-07-13 15:28 <bechamel> cedk: know what ?
2011-07-13 15:29 <sharoon> bechamel: cedk: i think there is a strong confusion here about `search` and `full text search`
2011-07-13 15:30 <cedk> bechamel: when to update the index for a field
2011-07-13 15:30 <cedk> sharoon: I don't think
2011-07-13 15:30 <sharoon> cedk: are you planning to completely replace ModelSQL.`search` with sphinx functionality ?
2011-07-13 15:30 <cedk> sharoon: no just the like operator
2011-07-13 15:31 <bechamel> sharoon: the last time we talked about providing it trough search_rec_name
2011-07-13 15:31 <bechamel> cedk: do you have an example where we don't know what to index ?
2011-07-13 15:32 <cedk> bechamel: like that no, but I'm pretty sure we got function fields that depends on the value of other Models
2011-07-13 15:32 <cedk> bechamel: and even if we don't have right now in Tryton, it is possible to get it
2011-07-13 15:32 <bechamel> cedk: good point
2011-07-13 15:33 <cedk> bechamel: and I don't want we implement a store=True like OE
2011-07-13 15:33 <bechamel> cedk: :)
2011-07-13 15:33 <bechamel> cedk: and with cache_invalidation methods :)
2011-07-13 15:33 <cedk> so I'm wondering if the sphinx index should not be filled like a google does with web
2011-07-13 15:34 <cedk> with perhaps a table with modified record to be indexed first
2011-07-13 15:34 <bechamel> cedk: actually the problem of "field that depend of other field" is still there event with a queue (or queue-like table)
2011-07-13 15:35 <bechamel> cedk: ok so whe need a way to constantly re-index the db in the background
2011-07-13 15:36 <dfamorato> bechamel: you mean re-index the whole db ?
2011-07-13 15:36 <dfamorato> dfamorato: each time ?
2011-07-13 15:36 <bechamel> dfamorato: 1) yes 2)no
2011-07-13 15:36 <cedk> dfamorato: yes but like a crawler
2011-07-13 15:37 <cedk> dfamorato: + a specific list a record to index in priority (last modified and newly created)
2011-07-13 15:37 <dfamorato> bechamel: re-indexing the database is what thakes most of the time on sphinx search
2011-07-13 15:38 <bechamel> dfamorato: yes but the idea is to do it slowly, to not kill postgresql
2011-07-13 15:38 <cedk> dfamorato: not a re-index but update on random selected records
2011-07-13 15:38 <bechamel> dfamorato: like index 1000 records and then sleep some time
2011-07-13 15:38 <dfamorato> cedk: we can index the "base index" + the DELTA
2011-07-13 15:38 <cedk> we also need a table for removed records
2011-07-13 15:38 <cedk> dfamorato: don't understand
2011-07-13 15:39 <bechamel> cedk: the deletion list will be like the "hot record" list
2011-07-13 15:39 <dfamorato> cedk: if we build a master index every 12 hours(example), we can then build the index of the DELTA (changed) data for the next 11hours 59 minutes..
2011-07-13 15:40 <dfamorato> cedk bechamel, But with this
2011-07-13 15:40 <dfamorato> "
2011-07-13 15:41 <bechamel> cedk: delta in sphinx vocabulary is what we call the list of changed records
2011-07-13 15:41 <cedk> dfamorato: ok but why rebuild the index ?
2011-07-13 15:42 <bechamel> cedk: to play the role of the crawler
2011-07-13 15:42 <bechamel> cedk: but in one shot
2011-07-13 15:42 <cedk> bechamel: that's the delta update
2011-07-13 15:42 <dfamorato> cedk bechamel s/\"/ with this new proposed way, then we would need to have the tryton server make a table of all changes occoured in specific period
2011-07-13 15:43 <bechamel> dfamorato: yes
2011-07-13 15:43 <dfamorato> cedk: think of sphinx as a mysql instance/database
2011-07-13 15:44 <cedk> dfamorato: I know what is sphinx
2011-07-13 15:44 <cedk> but rebuild the index on fix time without any reason is not good
2011-07-13 15:44 <bechamel> dfamorato: but we don't care about the period: when a record is changed, his id is added to the table, when it is indexed, it gets removed from the table
2011-07-13 15:44 <cedk> on very large database, it could take a lot of time
2011-07-13 15:44 <dfamorato> bechamel: got it... when pull from db.. then pop from db
2011-07-13 15:45 <dfamorato> cedk: yes, it could like a lot of time
2011-07-13 15:45 <sharoon> cedk: bechamel: "we don't care about the period" -> that cannot be true.... you create a product, search for the product in search_rec_name to create a sale order and it wont be there
2011-07-13 15:45 <dfamorato> cedk: you need to rebuild indexes in the case you want to add new fields/attributes to be indexed
2011-07-13 15:45 <cedk> dfamorato: so don't rebuild index on fixed time, we just need to have an option to do it in case of trouble
2011-07-13 15:45 <bechamel> dfamorato: yes like a queue actualy, but without adding a new depency
2011-07-13 15:46 <cedk> dfamorato: what ? you can not append to the index?
2011-07-13 15:46 <bechamel> cedk, dfamorato: actually nobody want to re-index all every time, I don't know why you talk about it
2011-07-13 15:46 <bechamel> cedk: yes this is what delta are for
2011-07-13 15:46 <dfamorato> cedk: Yes, you can append to index... but if you want to add new "column
2011-07-13 15:46 <cedk> dfamorato: why new column ?
2011-07-13 15:47 <dfamorato> cedk: new "db column" to be indexed.. then you have to re-index data
2011-07-13 15:47 <cedk> dfamorato: the Model structure did not changed when you add a new record
2011-07-13 15:47 <cedk> dfamorato: which new column?
2011-07-13 15:48 <dfamorato> cedk: let's say in the future you say a module needs an extram column to be indexed... then you need to re-index data... But it will not occour frequently
2011-07-13 15:48 <dfamorato> cedk: It would be an optional manual step to rebuild index from scratch
2011-07-13 15:48 <bechamel> sharoon: except if "hot records" are pushed rapidly
2011-07-13 15:48 <cedk> dfamorato: yes as I said
2011-07-13 15:51 <bechamel> cedk: so, if we are ok to use a table to put record ids that have changed, how to use it: 1) all the time 2) only when sphinx is down
2011-07-13 15:53 <cedk> bechamel: all the time
2011-07-13 15:53 <dfamorato> cedk bechamel : Just to make sure I understant corretcly. We are using a table instead of a MessageQueue (rabbit) because we don't want an extra dependency ?
2011-07-13 15:53 <dfamorato> cedk bechamel : Is it the intention to make full-text search the default implementation then ?
2011-07-13 15:54 <dfamorato> cedk bechame : Because in order to implement full text search.. we already have a dependency which is the Sphinx Server itself...
2011-07-13 15:54 <bechamel> dfamorato: yes using rabbit just for this is a bit overkill
2011-07-13 15:55 <dfamorato> cedk bechamel : So, if someone want's the bennefit of the fulltext search, IMHO would not be that hard/nonsense
2011-07-13 15:55 <bechamel> dfamorato: full text search should be an option
2011-07-13 15:55 <dfamorato> cedk bechamel : to use a message queue
2011-07-13 15:57 <cedk> dfamorato: sphinx will be an option in Tryton configuration
2011-07-13 15:57 <cedk> dfamorato: but we don't need of an other software to just storing the record to process
2011-07-13 15:58 <cedk> dfamorato: your sphinx script can do the job
2011-07-13 15:58 <dfamorato> cedk: ok.. got it
2011-07-13 15:59 <bechamel> dfamorato: imo a message queue is overkill because 1) we are not doing multi-process communication 2) whe have already postgresql
2011-07-13 15:59 <dfamorato> bechamel: I understand
2011-07-13 15:59 <cedk> bechamel: but it would be good to have a long pulling wait to retrieve record from the table
2011-07-13 16:00 <bechamel> cedk: is it possible ?
2011-07-13 16:00 <dfamorato> cedk bechamel We can pull data in ranges and steps....
2011-07-13 16:00 <dfamorato> cedk bechamel: Sphinx allow us to tell how much rows to pull.... default is 1024
2011-07-13 16:01 -!- alimon(~alimon@189.154.110.53) has joined #tryton
2011-07-13 16:01 <bechamel> I found this http://www.postgresql.org/docs/8.4/static/sql-notify.html
2011-07-13 16:01 <bechamel> but "There is no NOTIFY statement in the SQL standard. "
2011-07-13 16:01 <cedk> bechamel: not standard
2011-07-13 16:02 <bechamel> ACTION back in 2 min
2011-07-13 16:03 <cedk> we must forget about postgresql, we must only use trytond
2011-07-13 16:06 <bechamel> cedk: so we must select the table and sleep in a loop, there a no way to "pull wait"
2011-07-13 16:08 <bechamel> cedk: or with a signal between threads ?
2011-07-13 16:09 <dfamorato> cedk bechamel : I think we can do something in sphinx
2011-07-13 16:10 <dfamorato> cedk bechamel : SPHINX DOC ( ranged query throttling, in milliseconds, optional, default is 0 which means no delay, enforces given delay before each query step)
2011-07-13 16:11 <dfamorato> cedk bechamel : Sorry.. that is for SQL data sources
2011-07-13 16:11 <bechamel> dfamorato: yes
2011-07-13 16:11 <bechamel> dfamorato: this makes no sense if we push data ourselves :)
2011-07-13 16:12 <dfamorato> bechamel: got it
2011-07-13 16:18 <cedk> I propose first implementation to be a simple loop with a sleep
2011-07-13 16:18 -!- zodman(~zodman@foresight/developer/zodman) has joined #tryton
2011-07-13 16:18 <cedk> after that we could add a interprocess for communication if needed
2011-07-13 16:19 <bechamel> cedk: ok
2011-07-13 16:19 <dfamorato> cedk bechamel : So, the trytond new table and worflow. Should I implement that or bechamel implements ?
2011-07-13 16:20 <bechamel> next decision, trigger vs write/create/delete overloading
2011-07-13 16:20 <bechamel> dfamorato: I propose that you implement it, but I will help you to design it
2011-07-13 16:21 <dfamorato> bechamel: ok, great... so it's decided then....
2011-07-13 16:21 <bechamel> dfamorato: have you already writen a tryton module ?
2011-07-13 16:21 <cedk> bechamel: no need for a module
2011-07-13 16:21 <cedk> everything is in the base
2011-07-13 16:22 <bechamel> cedk: yes but it will looks like a module, it's just that it will be in another directory
2011-07-13 16:22 <cedk> bechamel: in ir/
2011-07-13 16:22 <dfamorato> bechamel: No, i have not written a tryton module
2011-07-13 16:22 <bechamel> cedk: ir/sphinx_search.py ?
2011-07-13 16:23 <cedk> bechamel: I don't think we need to named it with sphinx
2011-07-13 16:23 <cedk> think generic
2011-07-13 16:24 <bechamel> cedk: so like with backend/postgresql backend/sqlite ?
2011-07-13 16:24 <cedk> bechamel: no
2011-07-13 16:25 <cedk> I think we don't need of a table in trytond
2011-07-13 16:25 <cedk> the sphinx script could just look at each Model for the latest modified
2011-07-13 16:26 <cedk> and keep the timestmap per object
2011-07-13 16:26 <cedk> perhaps in a sqlite DB
2011-07-13 16:26 <cedk> or a pickled dict
2011-07-13 16:28 <cedk> dfamorato: did you understand?
2011-07-13 16:28 <bechamel> cedk: what about deleted records ??
2011-07-13 16:28 <dfamorato> cedk: sorry but i did not understand
2011-07-13 16:32 <cedk> bechamel: the search engind could remove them on the fly
2011-07-13 16:32 <cedk> and the crawler could also do it
2011-07-13 16:32 <cedk> dfamorato: could you search on id in sphinx ?
2011-07-13 16:33 <dfamorato> cedk: yes, you can search an ID
2011-07-13 16:33 <dfamorato> cedk: and each document id must be unique
2011-07-13 16:33 <dfamorato> cedk: so, basically i import the "postgres" id as of this moment
2011-07-13 16:33 <cedk> bechamel: so we could have a crawler that remove deleted record from the index
2011-07-13 16:34 <bechamel> cedk: so you mean: loop on all the sphinx index and test if the record still exist ?
2011-07-13 16:35 <cedk> bechamel: yes
2011-07-13 16:35 <bechamel> cedk: :/
2011-07-13 16:37 <cedk> bechamel: it is like the crawler idea
2011-07-13 16:38 <cedk> bechamel: and I think that the result of querying sphinx will be put in SQL clause: id in (..)
2011-07-13 16:38 <bechamel> cedk: ok, but it means that when we search we must check if ids returned by sphinx is still in the db
2011-07-13 16:38 <bechamel> cedk: ok
2011-07-13 16:38 <cedk> bechamel: will be done by postgres
2011-07-13 16:38 <bechamel> dfamorato: still with us ?
2011-07-13 16:39 <dfamorato> bechamel: yes
2011-07-13 16:39 <bechamel> dfamorato: you understand the idea ?
2011-07-13 16:40 <bechamel> so we are back to the idea of using an independant script, but instead of letting sphinx read the db we will push data to it with xml_pipe
2011-07-13 16:40 <dfamorato> dfamorato: we can put the document to be deleted on the "killlist"
2011-07-13 16:41 <bechamel> dfamorato: yes
2011-07-13 16:41 <dfamorato> bechamel: which will remove the document from the index on the next indexing
2011-07-13 16:42 <dfamorato> bechamel: but still, where to store the data before the sphinx indexing/synchronization
2011-07-13 16:42 <dfamorato> bechamel: cedk said no need to a trytond table
2011-07-13 16:43 <bechamel> dfamorato: deleting document will also work like a crawler, wich means walking trough the sphinx index and check if the record is still in tryton, if not: kill it
2011-07-13 16:45 <dfamorato> bechamel: sorry, this looks contradictory, trydond will push data to sphinx through xmlpipe2.... but on data deletion sphix needs to connect to tryton ?
2011-07-13 16:47 <cedk> dfamorato: it is not trytond to push data in sphinx but an independ script that will connect to trytond
2011-07-13 16:48 <bechamel> dfamorato: no the same script that push data will also check sphinx content to see if there are stuff to delete (but I still don't know how to do it)
2011-07-13 16:48 <dfamorato> cedk: Can i use proteus to connect to tryton then ?
2011-07-13 16:48 <cedk> it could even be multi-threaded
2011-07-13 16:48 <cedk> dfamorato: no, you must import trytond
2011-07-13 16:49 <bechamel> dfamorato: do you know if its possible to search by id range, E.G. search for all records whose ids are between 0 and 1000 ?
2011-07-13 16:50 <cedk> dfamorato: with protues it will generate too much connection etc.
2011-07-13 16:50 <dfamorato> bechamel: we can index by ranges.....
2011-07-13 16:50 <dfamorato> bechamel: not so sure if we can search by ranges
2011-07-13 16:52 <bechamel> dfamorato: do you have a sphinx instance running ?
2011-07-13 16:52 <bechamel> dfamorato: I see that one can do search query like "@name Joe"
2011-07-13 16:53 <bechamel> dfamorato: so maybe "@id <1000"
2011-07-13 16:53 <dfamorato> bechamel: I have one.. but I need vpn to acces this server.....
2011-07-13 16:54 <dfamorato> bechamel: give couple minutes... I know we can select which search index to match
2011-07-13 16:54 <bechamel> dfamorato: another solution is to search for the empty string and use limit and offset
2011-07-13 16:54 <bechamel> dfamorato: no problem
2011-07-13 16:55 <cedk> bechamel: why do you want that?
2011-07-13 16:55 <dfamorato> bechamel: here is what can be queried from command line https://gist.github.com/1080457
2011-07-13 16:57 <bechamel> cedk: to get the ids that are in sphinx in order to check if they are stil in postgres (and if not kill them)
2011-07-13 16:57 <cedk> bechamel: ok
2011-07-13 16:58 <cedk> bechamel: otherwise it can loop on all index entries
2011-07-13 16:58 <bechamel> cedk: yes but how to get them
2011-07-13 16:58 <bechamel> I found this http://sphinxsearch.com/docs/2.0.1/sphinxql-reference.html
2011-07-13 16:59 <bechamel> this allow to query sphinx with an sql syntax
2011-07-13 17:01 <dfamorato> bechamel: Yes, we can do that
2011-07-13 17:02 <dfamorato> bechamel: But then, we need to write sql syntax on this query, make sphinx search server listen on that port as well
2011-07-13 17:02 <dfamorato> bechamel: An the sphinx_python API do not use the SphinxQL language
2011-07-13 17:03 <dfamorato> bechamel: So, not all functional accessible on SphinxQL are accessible by the python API
2011-07-13 17:05 <bechamel> dfamorato: anyway, if the python api offer a way to search by id, it's also good
2011-07-13 17:06 <bechamel> dfamorato: actually any methods that allow us consistently to walk trough all the sphinx records is ok
2011-07-13 17:07 <dfamorato> bechamel: SPHINX DOC = Query(self, query, index='*', comment='') method of sphinxapi.SphinxClient instance
2011-07-13 17:08 <dfamorato> bechamel: i am trying to query by id.... could figure out a way yet
2011-07-13 17:10 <bechamel> dfamorato: I saw "@name Joe" type of queries here http://sphinxsearch.com/docs/2.0.1/extended-syntax.html
2011-07-13 17:12 <dfamorato> bechamel: maybe we will have to store the id as id and also as string_filed
2011-07-13 17:13 <dfamorato> bechamel: in order to be matched
2011-07-13 17:13 <dfamorato> bechamel: i'm on a call, will be back in 5 min
2011-07-13 17:14 -!- helmor(~helmo@2.209.26.248) has joined #tryton
2011-07-13 17:17 <bechamel> dfamorato: actually is it possible to connect and query sphinx with the mysql client (and so use sphinxQL)
2011-07-13 17:18 <dfamorato> bechamel: yes, yes it is
2011-07-13 17:20 <bechamel> dfamorato: so, quick recap: the script will create two threads:
2011-07-13 17:21 <bechamel> one that will crawl the ids in sphinx, check if they are still in tryton, and delete them from sphinx if they are no more in tryton
2011-07-13 17:24 <bechamel> another that will check for new record in tryton (newer that the last record seen before), push them in sphinx and store the datetime of the last record he pushed in a small file/sqlite db
2011-07-13 17:32 <cedk> bechamel: + 1 thread that will crawl any Tryton's record
2011-07-13 17:36 <bechamel> cedk: oh yes I forgot that one
2011-07-13 17:39 -!- gremly(~gremly@200.106.202.91) has joined #tryton
2011-07-13 17:43 <dfamorato> bechamel: sorry for the daly, just finished my call... had to take it
2011-07-13 17:43 <dfamorato> bechamel: so, I got it... will will come up with a draft and if I have any question, I will let you guys know
2011-07-13 17:44 <cedk> dfamorato: release early, release often :-)
2011-07-13 17:45 <dfamorato> cedk: Yes, it's a bad practice (or not) that i have to not release code that does not glue properly...
2011-07-13 17:45 -!- woakas(~woakas@200.106.202.91) has joined #tryton
2011-07-13 17:46 <cedk> dfamorato: by release I mean a codereview for us
2011-07-13 17:47 <dfamorato> cedk: yes, I got that... didn't push any code to codereview yet....
2011-07-13 17:47 <dfamorato> cedk: afraid of the coments :)
2011-07-13 17:47 <bechamel> cedk, dfamorato: I discovered that with github it's possible to leave comment on commits, nice feature
2011-07-13 17:47 <dfamorato> cedk: and the lack of tests
2011-07-13 17:48 <dfamorato> bechamel: it is possible... also, i can edit code directly on github.... can create a branch of the project, work on new features and then merge back to core
2011-07-13 17:49 <dfamorato> bechamel: Github is awesome.... and it works seamlessly with mercurial as well...
2011-07-13 17:50 <dfamorato> bechamel: You can clone my project as a mercurial project.. contribute to it and then make a pull request to me... http://hg-git.github.com/
2011-07-13 17:52 <bechamel> dfamorato: IMO it's better to do the codereview with the usual tool in order to centralize stuff
2011-07-13 17:53 <bechamel> dfamorato: but using githug to store your repo is perfectly ok
2011-07-13 17:54 <dfamorato> dfamorato: I understand. It is not my plan to change the workflow of the standard module developement in tryton. I am just advocating Gihub for your future personal projects non-tryton related =D
2011-07-13 17:55 -!- sharoon(~sharoon@2001:470:5:630:e2f8:47ff:fe22:f228) has joined #tryton
2011-07-13 17:56 <dfamorato> bechamel: Well, if that is all then, I will go for lunch... Now finals are over, so I should be on IRC everyday
2011-07-13 18:05 <bechamel> dfamorato: ok, enjoy your meal :)
2011-07-13 18:05 <bechamel> ACTION leave office, bbl
2011-07-13 18:14 -!- vladimirek(~vladimire@adsl-dyn24.78-98-14.t-com.sk) has joined #tryton
2011-07-13 19:51 -!- pjstevns(~pjstevns@helpoort.xs4all.nl) has joined #tryton
2011-07-13 19:54 -!- pjstevns(~pjstevns@helpoort.xs4all.nl) has left #tryton
2011-07-13 19:59 -!- plantian(~ian@c-67-169-72-36.hsd1.ca.comcast.net) has joined #tryton
2011-07-13 20:00 -!- chrue(~chrue@host-091-097-191-037.ewe-ip-backbone.de) has joined #tryton
2011-07-13 20:01 <reichlich> is anywhere some documentation about the workflow model?
2011-07-13 20:03 <cedk> reichlich: nope, but you can have a look at the OE one, it should be still almost valid
2011-07-13 20:06 <reichlich> cedk, great
2011-07-13 20:07 -!- bvillasanti(~bruno@186-129-248-247.static.speedy.com.ar) has joined #tryton
2011-07-13 20:15 -!- sharoon(~sharoon@2001:470:5:630:e2f8:47ff:fe22:f228) has joined #tryton
2011-07-13 20:17 <sharoon> cedk: your mail says "Sometimes we indent with 4 spaces and other times with 4 spaces, I think this is wrong."
2011-07-13 20:17 <sharoon> cedk: i did not understand
2011-07-13 20:21 <cedk> sharoon: s/4 spaces/8 spaces/
2011-07-13 20:21 <sharoon> cedk: ok, so its just a typo
2011-07-13 20:32 -!- sharoon(~sharoon@2001:470:5:630:e2f8:47ff:fe22:f228) has left #tryton
2011-07-13 20:43 -!- bechamel(~user@host-85-201-144-79.brutele.be) has joined #tryton
2011-07-13 20:57 -!- chrue1(~chrue@dyndsl-091-096-014-184.ewe-ip-backbone.de) has joined #tryton
2011-07-13 21:02 -!- ccomb1(~ccomb@94.122.99.232) has joined #tryton
2011-07-13 21:47 -!- ecarreras(~under@unaffiliated/ecarreras) has joined #tryton
2011-07-13 22:17 -!- nicoe(~nicoe@146.81-247-81.adsl-dyn.isp.belgacom.be) has joined #tryton
2011-07-13 22:23 -!- mhi(~mhi@p54894D47.dip.t-dialin.net) has joined #tryton
2011-07-13 23:09 -!- elbenfreund1(~elbenfreu@p54B95832.dip.t-dialin.net) has joined #tryton
2011-07-13 23:16 -!- dfamorato(~dfamorato@173-9-190-185-miami.txt.hfc.comcastbusiness.net) has joined #tryton
2011-07-13 23:26 -!- bvillasanti(~bruno@186-129-248-247.static.speedy.com.ar) has left #tryton

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!