Mea Culpa
https://amathur.org
My Explorations & Adventures (Visit synesis.in for more details)

Enhancing LLMs: RAG & Graph RAG
https://amathur.org/2024/11/13/enhancing-llms-rag-graph-rag/
Wed, 13 Nov 2024 10:06:31 +0000

As we know, Large Language Models (LLMs) have emerged as a prominent AI application for generating text, images or video. Innovation is making strides everywhere. Their applications are almost limitless, from chatbots to LangChain-based applications. The market is flooded with LLMs from key providers like OpenAI, Meta, Google, Anthropic etc., varying by domain & size.

However, these models are trained on files and data that are somewhat dated. We would like to enhance them with our organisation’s private documents, and provide that information to the LLM as context.

The aim is that when an LLM is asked a question, it does not rely only on what it already knows. Instead, it first retrieves relevant information from local knowledge sources (local files such as PDFs, DOCs ...), so that the generated output is grounded in contextually enriched data. This is Retrieval-Augmented Generation (RAG).
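The retrieve-then-ask flow described above can be sketched in a few lines. This is only a toy: it ranks documents by word-count cosine similarity instead of real embeddings, the sample documents are invented, and the function names (retrieve, build_prompt) are my own for illustration. A real setup would use an embedding model and a vector store, and would send the prompt to an LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q_vec = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q_vec, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, documents, k=2):
    """Put the retrieved passages in front of the question as context."""
    context = "\n".join("- " + d for d in retrieve(question, documents, k))
    return "Answer using only this context:\n" + context + "\n\nQuestion: " + question


# Hypothetical private documents, invented for illustration.
DOCS = [
    "The leave policy allows 24 days of paid leave per year.",
    "Expense claims must be filed within 30 days of travel.",
    "The office cafeteria is open from 9 am to 6 pm.",
]

if __name__ == "__main__":
    # The resulting prompt is what would be sent to the LLM.
    print(build_prompt("How many days of paid leave do I get?", DOCS, k=1))
```

The point of the sketch is the order of operations: retrieve first, then hand the retrieved text to the model together with the question.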

However, there are challenges: the accuracy of retrieved information, heterogeneous data sources, ambiguous queries and a clear understanding of context.

Engineers from Microsoft have come up with sophisticated retrieval algorithms that better understand the semantics of a query and can improve the relevance of fetched documents. This is followed by efficiently indexing the knowledge base to speed up the process.

It is called GraphRAG. Conventional RAG typically keeps information as flat records (text chunks, or rows & columns of table databases), whereas GraphRAG stores it as Nodes (data records) and Edges (relationships, which can have properties) of a graph. That is the key difference. A node can store additional information too: if a Node represents a Person, it can store Name, Designation, Address etc. Queries can connect multiple graphs.
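To make the Node/Edge idea concrete, here is a minimal in-memory sketch of a property graph. It is not how GraphRAG or any graph database is implemented; the class names, the sample people and the relation names (WORKS_AT, REPORTS_TO) are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str  # e.g. "Person" or "Company"
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    relation: str
    target: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """A tiny in-memory property graph (illustration only, not a graph database)."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(label, properties)

    def add_edge(self, source, relation, target, **properties):
        self.edges.append(Edge(source, relation, target, properties))

    def neighbours(self, node_id, relation):
        """Ids of nodes reached from node_id over edges with the given relation."""
        return [e.target for e in self.edges
                if e.source == node_id and e.relation == relation]


def employer_of_manager(graph, person_id):
    """Two-hop query: person -REPORTS_TO-> manager -WORKS_AT-> company."""
    names = []
    for manager in graph.neighbours(person_id, "REPORTS_TO"):
        for company in graph.neighbours(manager, "WORKS_AT"):
            names.append(graph.nodes[company].properties["name"])
    return names


# Hypothetical sample data.
graph = PropertyGraph()
graph.add_node("p1", "Person", name="Asha", designation="Architect")
graph.add_node("p2", "Person", name="Ravi", designation="Developer")
graph.add_node("c1", "Company", name="Acme")
graph.add_edge("p1", "WORKS_AT", "c1", since=2019)  # the edge carries its own property
graph.add_edge("p2", "REPORTS_TO", "p1")

print(employer_of_manager(graph, "p2"))  # prints ['Acme']
```

The two-hop query is the kind of question a graph answers naturally by following relationships, where a flat table would need joins.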

The magic lies in Knowledge Graphs: integrating graph databases with LLMs to enrich the model’s context before generating a response.

Figure: An LLM-generated knowledge graph built using GPT-4 Turbo (source: https://microsoft.github.io/graphrag/).

I used OpenAI embeddings and Neo4j’s movies database for exploration, using its Cypher language for queries. Entities are modelled as Nodes and the relationships between them as Edges.

Some popular graph database offerings are from Ontotext, NebulaGraph and Neo4j.

I suggest referring to https://github.com/microsoft/graphrag for more details.

Please feel free to contact me (asheesh.mathur@gmail.com) for any clarifications

Microsoft Copilot – AI to Common Man’s Life
https://amathur.org/2024/03/17/microsoft-copilot-ai-to-common-mans-life/
Sun, 17 Mar 2024 08:25:55 +0000

It looks like Microsoft is smartly cashing in on its investment of a few billion dollars in OpenAI, by wisely integrating the trained Large Language Model and attracting its user base of Edge, Bing and the MS 365 suite of products.

It churns end users’ data in MS Graph and blends it with a huge prebuilt model (OpenAI) to enhance its product range.

It is an example of integrating and packaging lots of services on top of its pre-existing office suite & new products.

ChatGPT and its API have a slew of interesting uses.

Let me try to give a quick snapshot. The diagram below captures the interaction & flow of prompts and responses. MS Graph provides access to data stored across Microsoft 365 services and captures important entities from users’ data. Grounding happens here (zoom to view details).

Copilot sends modified prompt to LLM & receives response.

One concern is that Copilot might send every single thing to OpenAI, which you would not want because of privacy.

Prompts, responses and grounding data (MS Graph) aren’t used to train the model. OpenAI has no access to the data or the model.

The documents, mail etc. of a user are fully protected, giving peace of mind.

It comes loaded with plugins etc. to connect it to other systems.

Architects & developers: it’s worth customising it within Azure AI Studio.

Microsoft Copilot is integrated into Microsoft 365 as an AI assistant available in the 365 applications. Here are some illustrative features.

Word: Create, summarize, comprehend, refine, & elevate documents.

PowerPoint: Transform existing written documents into decks complete with speaker notes & sources.

Excel: Analyze data, generate graphs, and more.

OneNote: Create notes etc.

Power Platform is integrated into a lot of this as well.

It helps in generating images from your ideas, chatting, finding information about images, adjusting your PC’s settings; the list is long.

MS users can experience these features within these applications.

It’s also available to others, who can use its web interface and experience chat that is well organized, presented & categorized.

The token limit can be a bottleneck with it, compared with Google’s Gemini 1.5.

On the other hand, Google renamed its conversational AI tool Bard to Gemini. Its Gemini 1.5 model supports a context window of up to 1 million tokens, with up to a staggering 10 million reported in research testing, way beyond MS. It’s a new language model, but its integration is missing so far.

Maybe it will come out with something new in the near future.

I wonder at the human brain & imagination behind AI’s everyday use.

Just the tip of the iceberg.

Please feel free to contact me (asheesh.mathur@gmail.com) for further details or clarifications.

Mathematics: Magic Behind Artificial Intelligence & Machine Learning
https://amathur.org/2023/12/27/mathematics-magic-behind-artificial-intelligence-machine-learning/
Wed, 27 Dec 2023 08:19:41 +0000

As we all know, new advances in Artificial Intelligence are making waves these days, and there are rumours that it’s a major reason for layoffs in the industry. There is nothing to fear, though; it’s a matter of realignment to explore new ideas for its use.

I have been playing with Machine Learning since well before the evolution of major frameworks like TensorFlow or PyTorch. I used to work at the grass-roots level, doing multidimensional matrix manipulation, statistics & probability.

Strangely, during my formative years while learning CS & algorithms, I found the concepts were part of 1860’s Maths books in my library @St. Stephens.

New frameworks & libraries hide all these complexities under the hood & many a time we tend to ignore the maths underneath.

However, to be successful, understanding the basics will go a long way in this journey. Creating new systems & frameworks by hand helps in the long run and builds a solid foundation, especially for research.

Based on my experience, it’s Maths that makes Machines Learn.

For example, in LLMs words are interpreted as vectors, and the word “tensor” in TensorFlow refers to multidimensional arrays (generalised matrices) and their manipulation.

Numbers (vectors) help in deciphering the interpretation of, & relationships between, words in a sentence. Sounds amazing!
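A small sketch of how vectors can express relationships between words: related words point in similar directions, which cosine similarity measures. The three-dimensional vectors below are hand-made toy values, not real embeddings (those have hundreds of dimensions and are learned from data).

```python
import math

# Hypothetical toy "embeddings", invented for illustration.
VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    for word in ("queen", "apple"):
        score = cosine_similarity(VECTORS["king"], VECTORS[word])
        print("king vs", word, round(score, 3))
```

With these toy values, "king" scores much closer to "queen" than to "apple". It is class XII vector maths, nothing more.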

The path-breaking Transformer paper of 2017 is all about maths under the hood. So is the case with Generative AI: Diffusion Models & other innovations are basically mathematics & probability at play.

For all those who want to jump on the new bandwagon, it makes sense to brush up on class XII maths and focus on the building blocks: matrices, calculus, probability and its distributions, statistics. This will make your journey enjoyable.

You can contact me @ asheesh.mathur@gmail.com for any help and clarifications. I offer training and consultancy as well.

Times are changing, but it’s old wine in a new bottle!!

Best Wishes

Asheesh

Entropy & Stable Diffusion (Generative AI)
https://amathur.org/2023/10/09/entropy-stable-diffusion-generative-ai/
Mon, 09 Oct 2023 15:52:52 +0000

When I was in secondary school, I heard the term Entropy in science: randomness increases with time. I used to wonder about this phenomenon & its potential uses.

“The world is constantly moving towards disorder”

I formally studied it as a part of physics ... thermodynamics (second law).

Voila, decades later I realised that the beauty of randomness lies in computer science (AI) as well.

Entropy plays a part in the taming/training of machines (ML) & in information science. It quantifies the uncertainty in data, an indication of how much additional information is required for more accurate predictions.
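As a quick illustration of entropy as a measure of uncertainty, here is Shannon entropy, H = -sum(p * log2(p)), computed over the labels in a dataset. The function name and the sample labels are mine, chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Entropy in bits of the empirical distribution of the given labels."""
    counts = Counter(labels)
    total = len(labels)
    # Each term is -p * log2(p), where p is the share of one label.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    print(shannon_entropy(["cat", "cat", "cat", "cat"]))   # no uncertainty
    print(shannon_entropy(["cat", "dog", "cat", "dog"]))   # a fair coin toss
    print(shannon_entropy(["cat", "dog", "cow", "hen"]))   # four equally likely classes
```

A dataset with only one label has zero entropy, a 50/50 split has 1 bit, and four equally likely labels have 2 bits: the more evenly spread the outcomes, the more information is needed to predict one.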

History aside, concepts like Diffusion and its latent counterpart, Stable Diffusion, are the brains behind current Text-to-Image/Image-to-Image generation. Generative AI innovations like OpenAI’s DALL-E, Midjourney & Stable Diffusion have revolutionized the way we interact with images.

Text-conditioned models can efficiently generate images based on a text description.

How textual inputs can generate a unique, unseen image is a real wonder.

Deciphering how random (Gaussian) noise makes all the difference is amazing.

Noise is not the only option; new advances and research are emerging.

Building imagination in machines seems a challenge, and it is disturbing the artist community!

The notion of iterative refinement is applied to train a diffusion model capable of turning noise into beautiful, unseen synthetic images.
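To give a feel for the role of Gaussian noise, here is a minimal sketch of the forward (noising) step as formulated in DDPM-style diffusion models: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * noise. I am assuming a linear beta schedule with commonly used default values, and a four-number list stands in for an image. In a real model a trained network (the UNet) predicts the noise; here the true noise is passed back in, just to show that a good noise estimate is enough to recover the original.

```python
import math
import random


def linear_beta_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances that grow linearly from beta_start to beta_end."""
    if steps == 1:
        return [beta_start]
    return [beta_start + (beta_end - beta_start) * t / (steps - 1)
            for t in range(steps)]


def alpha_bars(betas):
    """Cumulative products of (1 - beta): the share of signal left at each step."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


def add_noise(x0, t, abar, rng):
    """Sample x_t directly from x_0; returns (x_t, noise)."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    signal, sigma = math.sqrt(abar[t]), math.sqrt(1.0 - abar[t])
    xt = [signal * x + sigma * e for x, e in zip(x0, noise)]
    return xt, noise


def recover_x0(xt, t, abar, predicted_noise):
    """Invert the noising formula, given an estimate of the noise."""
    signal, sigma = math.sqrt(abar[t]), math.sqrt(1.0 - abar[t])
    return [(x - sigma * e) / signal for x, e in zip(xt, predicted_noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    abar = alpha_bars(linear_beta_schedule(1000))
    x0 = [0.5, -0.2, 0.8, 0.1]  # a stand-in for image pixels
    xt, noise = add_noise(x0, 500, abar, rng)
    print("noisy:    ", [round(v, 3) for v in xt])
    print("recovered:", [round(v, 3) for v in recover_x0(xt, 500, abar, noise)])
```

By the last step almost no signal is left, which is why generation can start from pure noise and refine it step by step.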

Stable Diffusion, an open source project, focusses on generating diverse & high-quality images through the diffusion process.

It is a combination of multiple technologies; the following are its major components:

1. The pretrained text encoder (OpenAI’s CLIP).

2. The UNet model as noise predictor.

3. The decoder part of the variational autoencoder (VAE).

Do not be scared by the mathematical derivations in proofs; they will become clear as you progress.

Diffusion models have taken the throne as state-of-the-art generative models.

Seeing is believing: try Stable Diffusion @ https://stability.ai/

For developers, have a look at the article by Jay Alammar @ https://jalammar.github.io/illustrated-stable-diffusion. Jeremy Howard’s free FastAI course is a good place to start and can help clear the mist.

Hugging Face’s Diffusers library is one such implementation. It comes with pre-trained models for generating images, audio, and even 3D structures of molecules.

You can also try & play with KerasCV, another such pre-trained implementation.

Wishing you a fun filled journey…

Contact me for further details or any clarifications.

Machine Learning or Learn from Machines !
https://amathur.org/2023/07/21/machine-learning-or-learn-from-machines/
Fri, 21 Jul 2023 04:44:26 +0000

When we try to make machines learn, they learn from experience & make decisions accordingly. The magic of machine learning is honesty, discipline & hard work (tons of data to scan).

They shine in almost all fields, be it charting out treatment, detecting aberrations in X-rays and CT scans, or autonomous vehicles; the list is endless. They shine because they examine minutely and with precision, for example, pixel by pixel.

The reason for their success lies in their learning process, though the human brain is far superior.

I will list the reasons below.

On the contrary, when I meet/recruit freshers and young graduates, I see these learning principles missing, as well as the zeal to learn.

The culprit could be our present-day education system, a lack of good faculty & a plethora of engineering colleges all around that ingest students without judging their aptitude and interest. Students look forward to someone feeding them ready-made solutions, and lack thinking & exploration.

They do not read books, work on the exercises included, or analyse what they learn and how it can be used practically.

Maths is taught in a dry fashion, without its practical application.

Masses of students are pushed into IT not because of their interest, but because of peer pressure & lack of opportunities elsewhere.

We are forgetting our ancient Gurukuls, their exams & evaluation system & above all discipline.

These are the points that make machines learn.

Maybe it’s time to learn from machines, take a cue and improve our future workforce:

  • Honesty & Discipline.
  • Minute observation of features, pixels. Detecting deviations and identifying their impact on the case.
  • Accept punishments & improve upon them.
  • Not scared of making mistakes; rather, learn from them.
  • Application of High School Maths
  • Patience, Hard Work & Sincerity
  • Collaborate with each other
  • Not scared of Slow and Steady progress
  • Learn….& Keep Learning

These small steps, if followed in any field, can make the journey a pleasure.

Time Series Analysis vs Quantum Magic – Predictions
https://amathur.org/2021/10/03/time-series-analysis-vs-quantum-magic-predictions/
Sun, 03 Oct 2021 04:26:06 +0000

There’s a notable spurt in Time Series Analysis using Statistics & Deep Learning, trying to predict the “Future” in health care, securities/stocks, traffic, weather etc. A race against randomness.

Quantum computing is trying to decipher the nuances of qubits and take computing to the next level.

I feel a revival of the erstwhile fictional “Star Trek”, making life safe and better by churning large amounts of data captured via IoT devices, medical instruments, genes & associated mutations!

It’s time to study, integrate and merge independent data collected from healthcare, stocks, transport and astronomy, and see if a Model emerges which can Predict.

Maybe there’s a correlation among them, not yet explored.

Statistics with Deep Learning can then make sense of it and make predictions.

George Box, a great statistician and author, rightly said, “All models are wrong, but some are useful.”

Long back, when I was exploring palmistry & astrology, “Forewarned is Forearmed” was the motive!

Time is up now …

We can capture, store & process humongous data to churn out a Meaningful Model.

The onus lies with the Government & Private Sector to share data and take the initiative in connecting varied data across diverse departments/units.

Data Scientists & Engineers together can clean, massage, interconnect & feed it to a Neural Network.

For example, I was working on a project to make sense of Geno & Phenotypic data using AI, to identify diseases like cancer, but boom, it was thwarted; randomness prevailed!

Further, I believe some magical power called “God” is managing the grand show.

One day … we shall overcome…

Is Astrology & Palmistry Scientific?
https://amathur.org/2020/06/20/is-astrology-palmistry-scientific/
Sat, 20 Jun 2020 08:51:54 +0000

I have been trying to decipher Astrology/Palmistry & Computers for almost the same duration of time, close to 20 years.

One question is: are these two fields scientific? Can we use them to predict future events, or the differences between twins with almost the same charts?

I thought it was beyond the realms of science as we know it today, but I was wrong!

The truth is these two fields are as scientific as Quantum Physics/Computing and Genetics as we understand them today.

The horoscope of a native is a chart of the planets’ positions at the time and place the person is born, very much like the genome of a human. The positions of planets and their houses are found using the same principles and formulas of physics/maths that are used by astronomers or the Met Department for predicting lunar cycles or eclipses. I can vouch for it, as I developed a similar package for the Times of India Group long back.

When it comes to predictions, a streak of uncertainty creeps in.

The mystery of which genes will mutate, or manifest in what form, is the same as that of the location of a quantum particle.

Similarly, there’s uncertainty in making predictions from a horoscope or palm print, not in the computing/making of the horoscope.

99.9% of the genome of any two humans is the same; it’s only 0.1% that makes all the difference.

The expression of genes, the Phenotype, is dependent on Genome + Environment. The environment is yet to be explored.

Part of that 0.1% is responsible for diseases and related sufferings or any positive traits (by chance).

The same reasoning explains the differences in twins with almost identical horoscopes.

The uncertainty in predicting gene manifestation is the same as in predicting events from horoscopes and palm prints.

Possibly, when science masters Quantum Physics/Computing/Genetic Manifestation, we will be able to predict events from horoscopes and use them for our benefit.

Final Frontier: Forms & Reports Oracle – 12c
https://amathur.org/2020/03/08/final-frontier-forms-reports-oracle-12c/
Sun, 08 Mar 2020 06:53:51 +0000

My explorations of Oracle 12c Forms and Reports did not end with installation and a cursory glance at its Forms and Reports. That agony is captured in my earlier blog.

It looks like another mystery was waiting to unravel when I tried deploying a 12c Report on a report server and invoking it via a Form. This is the practical way a report is invoked.

As per the installation document, I created a report server instance and named it “MyServer1” via a WLST command; it was a success. However, when I tried to access it via the getserverinfo option, it displayed a port binding error, “REP-51002 Bind to Reports Server rep_server_name failed”. This was not new; Oracle support had given solutions around opening port 14021 via Windows Firewall.

This did not work, and I thought of giving up this experiment on the Windows 10 machine.

But the keen developer inside did not let me sleep .. (miles to go before I sleep !)

Finally HIS grace knocked; as always, I chanced to look at the middle of a YouTube video that advocated embedding the report server name manually in “rwservlet.properties”. This file is spread across 3 different folders; a search within the Middleware folder will reveal them.

Secondly, the reference to securityId="rwJaznSec" has to be removed from all instances of the rwserver.conf files.

Finally Lady Luck smiled at the stroke of midnight; the Report Server and its integration with Forms went off peacefully.

This gives an indication of the state of Oracle 12c’s Forms and Reports offering.

Surprises – Trying Oracle 12c Forms & Reports on Win 10
https://amathur.org/2020/02/29/my-experiences-oracle-12c-forms-reports-on-win-10/
Sat, 29 Feb 2020 14:45:29 +0000

Oracle is a leader in providing database workhorses to big corporates and businesses worldwide. One thing I always liked about Oracle is its good and extensive documentation.

But this time it was time for a surprise.

One of its products for the database community has been Forms & Reports: Forms provide a Client/Server based UI to accept and feed data to criss-crossing tables, & Reports fetch the same and present it in different ways (PDF etc.).

This has been a success story for many years. However, with the passage of time and the prominence of the web, Oracle added a layer of Fusion Middleware, a battery of WebLogic Servers, in release 11g and now 12c. It offered a JNLP/Applet based solution so that Forms and Reports can be presented via a browser, instead of Oracle clients.

However, this model is deprecated and unsupported by most browsers, except for IE.

Still, Oracle 12c Fusion Middleware offers it for the next couple of years.

One of the interesting features, in 12c only, is a standalone launcher mode (FSAL). It lets developers test Forms on their desktops without loading them via a browser.

Note: it is not a typical web application and is not supported on mobiles. Forms are launched via a browser. Oracle recommends migrating Forms/Reports to its APEX framework to make them responsive web applications.

Anyway, I wanted to try and experience traditional Oracle Forms 6i on Oracle 12c FM.

Oracle does not support Mac as an OS for its DB and FM (12.2.x.y), so I opted for a Win10 laptop.

This decision was full of surprises and twists. Installation/configuration went on peacefully, but Forms would not launch in FSAL/standalone mode and Report Builder would not start at all.

Only web based launching via IE 11 was up.

I searched various forums (Oracle’s and others) without finding any clues.

Finally, in one of the old 11g support forums, responding to a similar issue, they accepted it as a defect in 11g to be fixed in subsequent releases, and recommended that the full path including Middleware Home be less than 64 characters on Windows machines. This defect was not marked closed.

I assumed this would have been fixed in the next generation, 12c, so I ignored it. I lost all hope struggling to make it work, and finally removed it from the laptop.

But I still had a glimmer of hope, so decided to give it another shot.

This time with all sub-directory names of 2 characters. Funny, it looked like “E:\or\mw\oh\up\bm\bd …”, but this keeps the entire tree small.
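Before installing, a few lines of Python can check whether a planned directory tree stays under the limit. This is only a sketch: the 64-character figure is the one from the forum thread mentioned above, not an officially documented limit, and the two sample paths are invented for illustration.

```python
from pathlib import PureWindowsPath

# Limit quoted in the old 11g forum thread; treat it as an assumption.
MAX_PATH_CHARS = 64


def path_length_ok(path, limit=MAX_PATH_CHARS):
    """True if the full Windows path is shorter than the limit."""
    return len(str(PureWindowsPath(path))) < limit


if __name__ == "__main__":
    # Hypothetical candidate locations, not the actual install paths.
    candidates = [
        r"E:\or\mw\oh",
        r"C:\Oracle\Middleware\Oracle_Home\user_projects\domains\base_domain\config",
    ]
    for candidate in candidates:
        verdict = "ok" if path_length_ok(candidate) else "too long"
        print(len(candidate), verdict, candidate)
```

PureWindowsPath only manipulates the string, so the check can run on any OS before the folders exist.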

Voila, with some more undocumented surprises, everything worked as expected.

Interestingly this is not documented in official documents. I know of one case where it worked on a different version of Win.

Message of The Day for Developer Community

God Still Helps Those Who Help Themselves !

Never lose Hope. Especially when a giant like Oracle is involved.

Journey of Discovering Anomaly In Oracle Linux & 18cXE DB
https://amathur.org/2019/12/14/journey-of-discovering-anomaly-in-oracle-linux-18cxe-db/
Sat, 14 Dec 2019 09:05:00 +0000

Last week, I wanted to lay my hands on the latest incarnation of Oracle 18c Express Edition, a free version.

Unfortunately, it is not available for Mac, only for selected flavours of Linux and Windows. I had Ubuntu 18.04 on VirtualBox but could not install it easily. Gave up.

I thought of trying the AWS cloud, which comes with the RDS Oracle service. However, it does not offer 18c XE out of the box. Various other flavours, expensive licensed versions, are supported.

Developer instinct prompted me to try a raw EC2 instance of Oracle Linux 7 (based on Red Hat Enterprise Linux).

Since both the OS and the database are from the same vendor, I expected it to be a smooth affair.

Alas, a nightmare was waiting!

I started with a VPC, with both private and public subnets and a Security Group. Installation of the database was smooth as per the documentation, and connection worked fine from within the same instance.

The real challenge emerged when I tried to connect to it from SQL Developer on my Mac or any other EC2 instance. I tried all hacks to sort it out, but to no avail.

As a debugging strategy to get an insight into the issue, I tried another instance of RHEL with an older 11g version of XE, which worked.

After trials and changes, I realised some firewall-like issue was preventing traffic from other hosts after installation of Oracle 18c XE.

Finally, after manipulating the firewall to allow traffic on the specific port, I could make it work!

Here are the commands for the same:

# Assuming port # is 1521; similarly, to allow access to EM and APEX, we may have to open ports 5500 and 8080 (or whatever is configured)

sudo firewall-cmd --permanent --zone=public --add-port=1521/tcp
sudo firewall-cmd --reload
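To confirm from another machine that a port is actually reachable after a firewall change, a small TCP check is enough. This is a generic sketch using Python’s standard socket module; the host below is a placeholder (localhost), to be replaced with the address of the database server, and the port list simply mirrors the ports mentioned above.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    host = "127.0.0.1"  # placeholder: use the DB server's address
    for port in (1521, 5500, 8080):
        status = "open" if port_is_open(host, port, timeout=1.0) else "blocked/closed"
        print(host, port, status)
```

If the port shows as open from the database host itself but not from a remote machine, a firewall (or the AWS Security Group) in between is the likely suspect.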

Installing this version of the database on OL appears to modify the firewall; for background on the Linux firewall, see:

https://oracle-base.com/articles/linux/linux-firewall#iptables

Please feel free to share your experiences.

Happy Explorations
